How to use grep to search for a line with one of two words but not both?
With GNU awk:
$ printf '%s\n' {foo,bar}{bar,foo} neither | gawk 'xor(/foo/,/bar/)'
foofoo
barbar
Or portably:
awk '((/foo/) + (/bar/)) % 2'
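The portable form works because a bare regex match in awk evaluates to 1 or 0, so the sum is odd exactly when one of the two patterns matches. A minimal check, with the test words spelled out instead of relying on brace expansion (xor_portable is a hypothetical wrapper name used only here):

```shell
# xor_portable: hypothetical wrapper around the portable awk one-liner
xor_portable() {
  # each /re/ yields 1 on a match and 0 otherwise; an odd sum means exactly one matched
  awk '((/foo/) + (/bar/)) % 2'
}

printf '%s\n' foobar foofoo barbar barfoo neither | xor_portable
# prints:
# foofoo
# barbar
```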
With a grep that supports -P (PCRE):
grep -P '^((?=.*foo)(?!.*bar)|(?=.*bar)(?!.*foo))'
With sed:
sed '
/foo/{
/bar/d
b
}
/bar/!d'
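For readers less used to sed, here is the same script annotated line by line and run on sample input. This is only a sketch; xor_sed is a hypothetical wrapper name.

```shell
# xor_sed: hypothetical wrapper around the sed script, with comments added
xor_sed() {
  sed '
# Lines that contain foo ...
/foo/{
# ... are deleted if they also contain bar,
/bar/d
# ... and are otherwise printed: b without a label jumps to the end of the script.
b
}
# Lines without foo are deleted unless they contain bar.
/bar/!d'
}

printf '%s\n' foobar foofoo barbar barfoo neither | xor_sed
# prints:
# foofoo
# barbar
```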
If you want to consider whole words only (so that neither foo nor bar is found in foobar or barbar, for instance), you need to decide how those words are delimited. If they are delimited by any character other than letters, digits and underscore, as with the -w option of many grep implementations, change those to:
gawk 'xor(/\<foo\>/,/\<bar\>/)'
awk '((/(^|[^[:alnum:]_])foo([^[:alnum:]_]|$)/) + \
(/(^|[^[:alnum:]_])bar([^[:alnum:]_]|$)/)) % 2'
grep -P '^((?=.*\bfoo\b)(?!.*\bbar\b)|(?=.*\bbar\b)(?!.*\bfoo\b))'
For sed, that becomes a bit complicated unless you have a sed implementation, like GNU sed, that supports \< and \> as word boundaries, as GNU awk does.
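Assuming GNU sed (the \< and \> anchors are an extension, not POSIX), the whole-word variant of the sed script could look like this. xor_sed_words is a hypothetical name used for the demonstration.

```shell
# xor_sed_words: hypothetical wrapper; requires a sed with \< and \> (e.g. GNU sed)
xor_sed_words() {
  sed '
/\<foo\>/{
/\<bar\>/d
b
}
/\<bar\>/!d'
}

printf '%s\n' 'foo bar' 'foo foobar' barbar bar foo | xor_sed_words
# prints:
# foo foobar
# bar
# foo
```

Here "foo foobar" is kept because the bar inside foobar is not a whole word, and barbar is dropped because it contains neither word on its own.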
grep 'word1\|word2' text.txt searches for lines containing word1 or word2. This includes lines that contain both.
grep word1 text.txt | grep word2 searches for lines containing word1 and word2. The two words can overlap (e.g. foobar contains foo and ob). Another way to search for lines containing both words, but only in a non-overlapping way, is to search for them in either order: grep 'word1.*word2\|word2.*word1' text.txt
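The difference between the two "both words" searches shows up on the foobar example. A small sketch follows; the function names are hypothetical, and \| in a basic regular expression is an extension that POSIX does not specify (GNU grep supports it, and grep -E 'a|b' is the portable spelling).

```shell
# both_filtered: each grep only asks whether its own word occurs somewhere
both_filtered() { grep foo | grep ob; }

# both_ordered: the two words must appear as separate, complete occurrences
both_ordered() { grep 'foo.*ob\|ob.*foo'; }

printf '%s\n' foobar | both_filtered                     # prints foobar
printf '%s\n' foobar | both_ordered || echo '(no match)' # prints (no match)
```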
grep word1 text.txt | grep -v word2 searches for lines containing word1 but not word2. The -v option tells grep to keep non-matching lines and remove matching lines, instead of the opposite. This gives you half the results you wanted. By adding the symmetric search, you get all the lines containing exactly one of the words.
grep word1 text.txt | grep -v word2
grep word2 text.txt | grep -v word1
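The two commands can be wrapped in a small function. This is only a sketch: exactly_one and its argument order are made up for the example, and the words are treated as grep patterns.

```shell
# exactly_one WORD1 WORD2 FILE (hypothetical helper)
exactly_one() {
  grep -e "$1" -- "$3" | grep -v -e "$2"   # lines with WORD1 but not WORD2
  grep -e "$2" -- "$3" | grep -v -e "$1"   # lines with WORD2 but not WORD1
  return 0
}

tmp=$(mktemp)
printf '%s\n' 'bar only' 'foo and bar' 'foo only' neither > "$tmp"
exactly_one foo bar "$tmp"
rm -f "$tmp"
# prints:
# foo only
# bar only
```

Note that the output is grouped by word (all word1-only lines first), not in the order of the lines in the file.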
Alternatively, you can start from the lines containing either word, and remove the lines containing both words. Given the building blocks above, this is easy if the words don't overlap.
grep 'word1\|word2' text.txt | grep -v 'word1.*word2\|word2.*word1'
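To see both the normal case and the overlap caveat, the pipeline can be tried on standard input. This is a sketch; either_not_both is a hypothetical name, and the \| alternation assumes a grep that supports it in basic regular expressions, such as GNU grep.

```shell
# either_not_both WORD1 WORD2 (hypothetical helper, reads lines on stdin)
either_not_both() {
  grep -e "$1\|$2" | grep -v -e "$1.*$2\|$2.*$1"
  return 0
}

# Non-overlapping words: lines with exactly one word are kept, in input order.
printf '%s\n' 'bar only' 'foo and bar' 'foo only' neither | either_not_both foo bar
# prints:
# bar only
# foo only

# Overlapping words: foobar contains both foo and ob, yet it is kept,
# because neither foo.*ob nor ob.*foo matches it.
printf '%s\n' foobar | either_not_both foo ob
# prints:
# foobar
```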
A bash solution:
#!/bin/bash
while (( $# )); do
  a=0 ; [[ $1 =~ foo ]] && a=1
  b=0 ; [[ $1 =~ bar ]] && b=1
  (( a ^ b )) && echo "$1"
  shift
done
To test it:
$ ./script {foo,bar}\ {foo,bar} neither
foo foo
bar bar
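The script above tests its arguments. To filter lines of input instead, as grep does, the same exclusive-or test can be put in a read loop. The sketch below swaps the bash-only [[ =~ ]] for case patterns so that it also runs in a plain POSIX sh; xor_lines is a hypothetical name.

```shell
# xor_lines: hypothetical helper, reads lines on stdin
xor_lines() {
  while IFS= read -r line; do
    a=0; case $line in *foo*) a=1 ;; esac
    b=0; case $line in *bar*) b=1 ;; esac
    # print the line when exactly one of the two flags is set
    [ "$a" -ne "$b" ] && printf '%s\n' "$line"
  done
  return 0
}

printf '%s\n' 'foo foo' 'foo bar' 'bar foo' 'bar bar' neither | xor_lines
# prints:
# foo foo
# bar bar
```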