Tool in unix to subtract text files?
You can use grep. Give it the small file as the list of patterns and tell it to print the non-matching lines:
grep -vxFf file.txt bigfile.txt > newbigfile.txt
The options used are:
-F, --fixed-strings
    Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (-F is specified by POSIX.)
-f FILE, --file=FILE
    Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing. (-f is specified by POSIX.)
-v, --invert-match
    Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)
-x, --line-regexp
    Select only those matches that exactly match the whole line. (-x is specified by POSIX.)
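To see why -x and -F matter here, a small sketch may help. The sample contents below are made up for illustration (only the file names come from the command above), and it is best run in an empty scratch directory because it overwrites those files:

```shell
# Hypothetical sample data, chosen so that substring and regex matching would go wrong.
printf '%s\n' apple app a.c abc > bigfile.txt
printf '%s\n' app a.c > file.txt

# Whole-line, fixed-string subtraction: only "app" and "a.c" are removed.
grep -vxFf file.txt bigfile.txt > newbigfile.txt
cat newbigfile.txt            # prints: apple, abc (one per line)

# Without -x, "app" also matches inside "apple", so too much is removed.
grep -vFf file.txt bigfile.txt    # prints: abc

# Without -F, "a.c" is a regex whose dot matches any character, so "abc" goes too.
grep -vxf file.txt bigfile.txt    # prints: apple
```

The order of lines in bigfile.txt is preserved, and neither file needs to be sorted.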
comm is your friend:
NAME
    comm - compare two sorted files line by line
SYNOPSIS
    comm [OPTION]... FILE1 FILE2
DESCRIPTION
    Compare sorted files FILE1 and FILE2 line by line.

    With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.

    -1  suppress column 1 (lines unique to FILE1)
    -2  suppress column 2 (lines unique to FILE2)
    -3  suppress column 3 (lines that appear in both files)
(comm will probably perform better than grep on large files, since it takes the sortedness into account and can compare the two files in a single pass. Note that both files must already be sorted.)
For example:
comm -1 -3 file.txt bigfile.txt > newbigfile.txt
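If the files are not sorted yet, something like the following should work. This is only a sketch: the sample contents and the file.sorted / bigfile.sorted names are invented for the example, and it overwrites the files it names, so try it in a scratch directory:

```shell
# Hypothetical, deliberately unsorted sample data.
printf '%s\n' cherry apple date banana > bigfile.txt
printf '%s\n' date banana > file.txt

# comm expects sorted input; LC_ALL=C makes sort and comm agree on the ordering.
LC_ALL=C sort file.txt > file.sorted
LC_ALL=C sort bigfile.txt > bigfile.sorted

# -13 is shorthand for -1 -3: drop lines unique to file.sorted and lines common to both,
# leaving only the lines that appear in bigfile.sorted alone.
LC_ALL=C comm -13 file.sorted bigfile.sorted > newbigfile.txt
cat newbigfile.txt            # prints: apple, cherry (one per line)
```

Unlike the grep approach, the result comes out in sorted order rather than in the original order of bigfile.txt.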