join : "File 2 not in sorted order"

I got the same error on Ubuntu 11.04, with sort and join both at version 8.5 (GNU coreutils).

At first sight they look incompatible, and the sort command even looks buggy: it makes no difference whether the -f (--ignore-case) option is given or not. When sorting, aaB always comes before aBa. Non-alphanumeric characters also seem to be ignored (abc sorts before ab-x).

join seems to expect the opposite order... but I have a solution.

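In fact, this comes down to the locale's collation sequence: running LANG=en_EN sort -k 1,1 <myfile> ... and then LANG=en_EN join ... eliminates the message. What matters is that sort and join run under the same collation. Note that en_EN is most likely not an installed locale, in which case both commands fall back to plain byte order, which is what LC_ALL=C requests explicitly.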
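
As a minimal sketch of that fix, the snippet below sorts two small files and joins them under one collation. It uses LC_ALL=C rather than LANG=en_EN, on the assumption that plain byte order is acceptable for your keys; LC_ALL is chosen because it overrides any LC_COLLATE or LANG already set in the environment. The file names and sample rows are made up for illustration.

```shell
# Scratch directory and sample data (hypothetical names and contents).
tmp=$(mktemp -d)
printf 'aBa 1\naaB 2\nab-x 3\nabc 4\n' > "$tmp/file1.txt"
printf 'abc x\naaB y\nab-x z\n' > "$tmp/file2.txt"

# Sort both inputs on the join field under the same byte-order collation.
LC_ALL=C sort -k 1,1 "$tmp/file1.txt" > "$tmp/file1.sorted"
LC_ALL=C sort -k 1,1 "$tmp/file2.txt" > "$tmp/file2.sorted"

# Run join under the same locale so its order check agrees with sort.
joined=$(LC_ALL=C join "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$joined"
```

In the C locale uppercase letters and the hyphen sort before lowercase letters, so aBa comes before aaB and ab-x before abc, and join accepts both files without complaint.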

Internationalisation is the root of evil... (nobody documents it clearly).


Were you sorting with numbers? I found that zero-padding the column that I was joining on solved this issue for me.

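# Zero-pad field 20 to six digits; %06d pads numbers portably, whereas
# zero-padding with %06s depends on the awk implementation.
# OFS keeps the original separator when awk rebuilds the line.
awk -F"   " -v OFS="   " '{ $20 = sprintf("%06d", $20); print }' file.txt \
     | sort > readytojoin.txt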
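
To see why padding helps, here is a small sketch with made-up values: join compares keys as strings, so numeric keys are only in the order it expects once they all have the same width.

```shell
# Plain lexical sort compares character by character, so "10" lands before "9".
unpadded=$(printf '9\n10\n100\n' | LC_ALL=C sort)

# After zero-padding to a fixed width, lexical order and numeric order agree.
padded=$(printf '9\n10\n100\n' | awk '{ printf "%06d\n", $1 }' | LC_ALL=C sort)

printf '%s\n' "$unpadded"
printf '%s\n' "$padded"
```

Sorting with sort -n instead would not help here, because the numeric order it produces is not the string order that join checks for.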

If you are sure you properly sorted your input files and their lines can be paired, you can avoid the above error by running join --nocheck-order file1.txt file2.txt
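
Before switching the check off, it may be worth confirming that the files really are sorted on the join field. A rough sketch, assuming the join key is field 1 and byte-order collation; check_sorted is a hypothetical helper name and the sample files are made up:

```shell
# check_sorted is a hypothetical helper: succeeds if the file is ordered on field 1.
check_sorted() {
    # sort -C only checks the order and reports it through the exit status.
    LC_ALL=C sort -C -k 1,1 "$1"
}

tmp=$(mktemp -d)
printf 'a 1\nb 2\nc 3\n' > "$tmp/good.txt"
printf 'b 2\na 1\nc 3\n' > "$tmp/bad.txt"

if check_sorted "$tmp/good.txt"; then good_status=sorted; else good_status=unsorted; fi
if check_sorted "$tmp/bad.txt"; then bad_status=sorted; else bad_status=unsorted; fi
echo "good.txt: $good_status"
echo "bad.txt: $bad_status"
```

Using sort -c (lowercase) instead also prints the first out-of-order line, which helps when tracking down where the order breaks.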
