join : "File 2 not in sorted order"
I got the same error on Ubuntu 11.04, with sort and join both at version 8.5 (GNU coreutils).
They are clearly incompatible. In fact, the sort command seems buggy: the -f (--ignore-case) option makes no difference. When sorting, aaB always comes before aBa. Non-alphanumeric characters also seem to be ignored (abc comes before ab-x).
join seems to expect the opposite... But I have a solution.
In fact, this is linked to the collation sequence: running LANG=en_EN sort -k 1,1 <myfile> ... and then LANG=en_EN join ... eliminates the message, because both commands then compare keys under the same locale.
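To make the collation point concrete, here is a minimal sketch of sorting and joining under one locale. Note the assumptions: I use LC_ALL=C (which overrides LANG and LC_COLLATE) rather than the LANG value above, and the file names and contents are made up for illustration.

```shell
# Hypothetical sample files; names and contents are for illustration only.
tmpdir=$(mktemp -d)
printf 'aBa 1\naaB 2\nab-x 3\nabc 4\n' > "$tmpdir/left.txt"
printf 'abc x\nab-x y\naaB z\naBa w\n' > "$tmpdir/right.txt"

# Sort both files on the join field under the same collation (here the C locale).
LC_ALL=C sort -k 1,1 "$tmpdir/left.txt"  > "$tmpdir/left.sorted"
LC_ALL=C sort -k 1,1 "$tmpdir/right.txt" > "$tmpdir/right.sorted"

# Join under that same locale, so join checks the order that sort produced.
joined=$(LC_ALL=C join "$tmpdir/left.sorted" "$tmpdir/right.sorted")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

In the C locale keys are compared byte by byte, so aBa sorts before aaB and ab-x before abc. Any locale should work as long as sort and join both run under it.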
Internationalisation is the root of evil... (nobody documents it clearly).
Were you sorting with numbers? I found that zero-padding the column that I was joining on solved this issue for me.
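# Zero-pad field 20 (assumed to hold integers) so text order matches numeric order,
# then sort on that field before joining on it.
awk '{ $20 = sprintf("%06d", $20); print }' file.txt \
| sort -k 20,20 > readytojoin.txt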
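Why padding helps, as a small sketch with made-up keys: a plain sort compares keys as text, so 10 lands before 2, and join checks text order as well. After zero-padding to a fixed width, text order and numeric order agree. The six-digit width and the sample values are assumptions for illustration.

```shell
# Hypothetical keys, for illustration only.
unpadded=$(printf '9\n10\n2\n' | LC_ALL=C sort)
padded=$(printf '9\n10\n2\n' | awk '{ printf "%06d\n", $1 }' | LC_ALL=C sort)

printf 'unpadded:\n%s\npadded:\n%s\n' "$unpadded" "$padded"
```

The unpadded keys come out as 10, 2, 9, while the padded ones come out as 000002, 000009, 000010.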
If you are sure you properly sorted your input files and their lines can be paired, you can avoid the above error by running join --nocheck-order file1.txt file2.txt
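Before switching the check off, it may be worth confirming that the file really is sorted under the locale join will run in. A rough sketch using sort -c, which exits non-zero when the input is out of order; the file name, its contents, and the choice of the C locale are assumptions for illustration.

```shell
# Hypothetical sample data: this order is not sorted in the C locale.
tmpdir=$(mktemp -d)
printf 'aaB 2\naBa 1\n' > "$tmpdir/file2.txt"

# sort -c only checks the order; it does not write sorted output.
if LC_ALL=C sort -c -k 1,1 "$tmpdir/file2.txt" 2>/dev/null; then
    status="sorted"
else
    status="not sorted"
fi
echo "file2.txt is $status in the C locale"

rm -rf "$tmpdir"
```

Here the check reports "not sorted", because in the C locale the uppercase B sorts before the lowercase a, so aBa must come before aaB.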