I want to compare values of two files, but not based on position or sequence

Compare the sorted files.

In bash (or ksh or zsh), with a process substitution:

diff <(sort File1.txt) <(sort File2.txt)

In plain sh:

sort File1.txt >File1.txt.sorted
sort File2.txt >File2.txt.sorted
diff File1.txt.sorted File2.txt.sorted

To see quickly how two sorted files differ, comm can be useful: it directly lists the lines common to both files and the lines that are in one file but not the other.

comm -12  <(sort File1.txt) <(sort File2.txt) >common-lines.txt
comm -23  <(sort File1.txt) <(sort File2.txt) >only-in-file-1.txt
comm -13  <(sort File1.txt) <(sort File2.txt) >only-in-file-2.txt
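
For plain sh, the same three reports can be produced from sorted temporary files. This is only a sketch: the sample file contents and the scratch directory are made up for illustration, and LC_ALL=C is set on both sort and comm so that they agree on the ordering.

```shell
# Hypothetical sample data in a scratch directory (not from the question).
tmpdir=$(mktemp -d)
printf 'cherry\napple\nbanana\n' >"$tmpdir/File1.txt"
printf 'banana\ndate\ncherry\n' >"$tmpdir/File2.txt"

# comm requires sorted input, so sort each file first.
LC_ALL=C sort "$tmpdir/File1.txt" >"$tmpdir/File1.txt.sorted"
LC_ALL=C sort "$tmpdir/File2.txt" >"$tmpdir/File2.txt.sorted"

# -12 keeps only column 3 (lines in both files),
# -23 keeps only column 1 (lines only in the first file),
# -13 keeps only column 2 (lines only in the second file).
common=$(LC_ALL=C comm -12 "$tmpdir/File1.txt.sorted" "$tmpdir/File2.txt.sorted")
only1=$(LC_ALL=C comm -23 "$tmpdir/File1.txt.sorted" "$tmpdir/File2.txt.sorted")
only2=$(LC_ALL=C comm -13 "$tmpdir/File1.txt.sorted" "$tmpdir/File2.txt.sorted")

printf 'common:\n%s\n' "$common"
printf 'only in file 1:\n%s\n' "$only1"
printf 'only in file 2:\n%s\n' "$only2"
```

With this sample data, banana and cherry are common, apple is only in the first file, and date is only in the second.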

If a line is repeated in the same file, the commands above consider the files equal only if both contain that line the same number of times. If you want to treat

foo
bar
foo

as identical to

bar
foo

then remove duplicates when sorting: use sort -u instead of sort.
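
A quick way to convince yourself, using the two sample files above (written to a scratch directory here, which is an assumption made for the demonstration):

```shell
# Recreate the two example files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'foo\nbar\nfoo\n' >"$tmpdir/File1.txt"
printf 'bar\nfoo\n' >"$tmpdir/File2.txt"

# sort -u drops repeated lines, so only the set of distinct lines is compared.
sort -u "$tmpdir/File1.txt" >"$tmpdir/File1.txt.sorted"
sort -u "$tmpdir/File2.txt" >"$tmpdir/File2.txt.sorted"

# diff exits with status 0 when its two inputs are identical.
if diff "$tmpdir/File1.txt.sorted" "$tmpdir/File2.txt.sorted" >/dev/null; then
    result=same
else
    result=different
fi
echo "$result"
```

With plain sort instead of sort -u, the extra foo in the first file would show up as a difference.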

If you save the sorted output of one file and compare it later, when the other file becomes available, note that the two files must be sorted in the same locale. In that case, you should probably sort in byte order:

LC_ALL=C sort File1.txt >File1.txt.sorted
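
A sketch of the deferred comparison, with made-up mixed-case sample data to make the byte ordering visible. The point is only that both sort runs use LC_ALL=C, whenever and wherever they happen:

```shell
# Hypothetical sample data in a scratch directory.
tmpdir=$(mktemp -d)
printf 'apple\nBanana\n' >"$tmpdir/File1.txt"
printf 'Banana\napple\n' >"$tmpdir/File2.txt"

# Step 1: run when only the first file is available.
LC_ALL=C sort "$tmpdir/File1.txt" >"$tmpdir/File1.txt.sorted"

# Step 2: run later (possibly under a different default locale).
LC_ALL=C sort "$tmpdir/File2.txt" >"$tmpdir/File2.txt.sorted"

# In byte order, uppercase ASCII letters sort before lowercase ones.
first_line=$(head -n 1 "$tmpdir/File1.txt.sorted")
echo "first line: $first_line"

if diff "$tmpdir/File1.txt.sorted" "$tmpdir/File2.txt.sorted" >/dev/null; then
    status=same
else
    status=different
fi
echo "$status"
```

If you feed the saved files to comm rather than diff, run comm under LC_ALL=C as well, since it expects its input to be sorted according to the locale it runs in.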

Sort the files first (in bash):

diff <(sort file1) <(sort file2)

Using awk, you can build a hash indexed by the text of every distinct input line, using a command like:

awk 'The magic' Q=A fileA Q=B fileB Q=C fileC ...

'The magic' per input line is:

{ X[$0] = X[$0] Q; }

In the END block, you iterate over the indexes of X. Any line that occurred exactly once in each file will look like:

X["Apple"] = "ABC";

A line that appeared once in fileA and three times in fileC would present as "ACCC". You can report any anomalies any way you like, for any number of files. (I once had to implement a 14-way comparison on a safety-critical system that ran on a Main and Standby server, each with a real-time plus Oracle database.)
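
Put together, a minimal runnable version might look like the following. The file names, sample contents, and the choice to report every line whose tag is not exactly "ABC" are assumptions for illustration; the output is piped through sort because awk's for-in loop visits the array in no particular order.

```shell
# Hypothetical sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'Apple\nWalrus\n' >"$tmpdir/fileA"
printf 'Walrus\nApple\n' >"$tmpdir/fileB"
printf 'Apple\nPear\nPear\n' >"$tmpdir/fileC"

report=$(
    cd "$tmpdir" &&
    awk '
        # Append the current file tag Q to the entry for this line text.
        { X[$0] = X[$0] Q }
        END {
            # Report every line that did not occur exactly once per file.
            for (line in X)
                if (X[line] != "ABC")
                    print X[line] "\t" line
        }
    ' Q=A fileA Q=B fileB Q=C fileC | LC_ALL=C sort
)
printf '%s\n' "$report"
```

Here Apple gets the tag ABC and is not reported, Walrus is reported as AB (missing from fileC), and Pear as CC (twice in fileC only).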

If you choose to include the line number on each tag (FNR for the number within the current file; NR keeps counting across all the files), and write some interesting patterns, you can make the tags look like:

X["Walrus"] = "A347B38C90"

and report which matching texts were on which lines in the various files.
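
A small sketch of the line-number variant, again with hypothetical files and contents. It uses FNR so that the numbers restart for each file:

```shell
# Hypothetical sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'Apple\nWalrus\n' >"$tmpdir/fileA"
printf 'Walrus\nApple\n' >"$tmpdir/fileB"

tags=$(
    cd "$tmpdir" &&
    awk '
        # Append the file tag and the per-file line number to the entry.
        { X[$0] = X[$0] Q FNR }
        END { for (line in X) print line "\t" X[line] }
    ' Q=A fileA Q=B fileB | LC_ALL=C sort
)
printf '%s\n' "$tags"
```

Apple ends up tagged A1B2 (line 1 of fileA, line 2 of fileB) and Walrus A2B1.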
