Deleting lines from one file which are in another file
Try comm instead (this assumes f1 and f2 are already sorted):
comm -2 -3 f1 f2
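If the files are not sorted yet, one option is to sort them into temporary copies first. This is only a sketch: the sample data and the f1.sorted / f2.sorted names are made up for illustration, and the result comes out in sorted order rather than in the original order of f1.

```shell
# Hypothetical sample data, just for illustration.
printf '%s\n' pear apple cherry banana > f1
printf '%s\n' cherry apple > f2

# comm needs sorted input, so sort into temporary copies first.
# LC_ALL=C makes sort and comm agree on the collation order.
LC_ALL=C sort f1 > f1.sorted
LC_ALL=C sort f2 > f2.sorted

# -2 suppresses lines unique to f2, -3 suppresses lines common to both,
# leaving only the lines that appear in f1 but not in f2.
LC_ALL=C comm -2 -3 f1.sorted f2.sorted
```

With this sample data the command should print banana and pear, one per line.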
grep -v -x -f f2 f1
should do the trick.
Explanation:
-v to select non-matching lines
-x to match whole lines only
-f f2 to get patterns from f2
One can instead use grep -F or fgrep to match fixed strings from f2 rather than patterns (in case you want to remove the lines in a "what you see is what you get" manner rather than treating the lines in f2 as regex patterns).
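To see why the distinction matters, here is a small sketch with made-up sample data in which f2 contains a regex metacharacter. The file contents are an assumption for illustration only.

```shell
# Hypothetical sample data: f2 holds a line containing a regex metacharacter.
printf '%s\n' 'a.c' 'abc' 'a+b' > f1
printf '%s\n' 'a.c' > f2

# Regex mode: the "." in "a.c" matches any character, so "abc" is removed too.
grep -v -x -f f2 f1

# Fixed-string mode: only the literal line "a.c" is removed.
grep -v -x -F -f f2 f1
```

The first command should print only a+b, while the second keeps both abc and a+b.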
For exclude files that aren't too huge, you can use AWK's associative arrays.
awk 'NR == FNR { list[tolower($0)]=1; next } { if (! list[tolower($0)]) print }' exclude-these.txt from-this.txt
The output will be in the same order as the "from-this.txt" file. The tolower()
function makes it case-insensitive, if you need that.
The algorithmic complexity will probably be O(m) (exclude-these.txt size) + O(n) (from-this.txt size), assuming AWK's associative arrays give constant-time lookups on average.
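If you want an exact, case-sensitive match instead, a variant without tolower() could look like the following. This is a sketch with made-up sample data; the array name seen is arbitrary.

```shell
# Hypothetical sample data, reusing the file names from the command above.
printf '%s\n' 'Banana' 'cherry' > exclude-these.txt
printf '%s\n' 'pear' 'banana' 'Cherry' 'cherry' 'apple' > from-this.txt

# First file (NR == FNR): record each line as an array key.
# Second file: print lines that are not keys; "in" tests membership
# without creating a new array entry.
awk 'NR == FNR { seen[$0]; next } !($0 in seen)' exclude-these.txt from-this.txt
```

Here only the exact line cherry is dropped, so banana and Cherry survive. One caveat with the NR == FNR idiom: if exclude-these.txt is empty, the condition also holds for every line of from-this.txt, so nothing is printed.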