Cartesian product of two files (as sets of lines) in GNU/Linux

Using only join (the output is separated by spaces rather than commas):

$ join -j 2 file1 file2
 a c
 a d
 a e
 b c
 b d
 b e
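This works because each line has a single field, so field 2 is empty on every line and every pair of lines matches. If a comma separator is wanted, the output can be post-processed. The following is only a sketch: it assumes each input line is a single field with no whitespace, and the function name and sample file contents are made up for illustration.

```shell
# Hypothetical helper: comma-separated Cartesian product built on `join -j 2`.
# Assumes every line of both files is one whitespace-free field.
cartesian_join() {
    # join prints the (empty) join field first, hence the leading space;
    # strip it, then turn the remaining separator into ", ".
    join -j 2 "$1" "$2" | sed 's/^ //; s/ /, /'
}

# Sample data (illustrative only)
demo_dir=$(mktemp -d)
printf '%s\n' a b > "$demo_dir/file1"
printf '%s\n' c d e > "$demo_dir/file2"
cartesian_join "$demo_dir/file1" "$demo_dir/file2"
```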

The mechanical way to do it in shell, not using Perl or Python, is:

while IFS= read -r line1
do
    while IFS= read -r line2
    do printf '%s, %s\n' "$line1" "$line2"
    done < file2
done < file1

The join command can sometimes be used for operations like this; however, I'm not sure it can produce a Cartesian product as a degenerate case.

One step up from the double loop would be:

while read line1
do
    sed "s/^/$line1, /" file2
done < file1
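One caveat: the line from file1 is pasted straight into the sed replacement text, so a /, & or \ in it would be misread by sed. A possible workaround, sketched here with a made-up function name and sample data, is to escape those three characters first:

```shell
# Hypothetical helper: prints "<line of $1>, <line of $2>" for every pair.
prefix_product() {
    while IFS= read -r line1
    do
        # Escape \, / and & so sed treats the line as literal replacement text.
        escaped=$(printf '%s\n' "$line1" | sed 's|[\\/&]|\\&|g')
        sed "s/^/$escaped, /" "$2"
    done < "$1"
}

# Sample data (illustrative only)
demo_dir=$(mktemp -d)
printf '%s\n' 'a/b' 'x&y' > "$demo_dir/file1"
printf '%s\n' c d > "$demo_dir/file2"
prefix_product "$demo_dir/file1" "$demo_dir/file2"
```

This still starts one sed process per line of file1, so it is no faster than the version above; it only makes it tolerant of those characters.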

Here's a shell script to do it:

while read a; do while read b; do echo "$a, $b"; done < file2; done < file1

Though that will be quite slow. I can't think of any precompiled logic to accomplish this. The next step for speed would be to do the above in awk/perl.

awk 'NR==FNR { a[$0]; next } { for (i in a) print i",", $0 }' file1 file2
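Note that awk does not specify the order in which `for (i in a)` visits the keys, and the one-liner emits its output grouped by the lines of file2. If the order of the double loop is wanted instead (grouped by file1, both files in their original order), something like the following might do. It is a sketch only: the function name and array names are invented here, and it holds both files in memory.

```shell
# Hypothetical helper: ordered Cartesian product using integer-indexed arrays.
cartesian_awk() {
    awk 'NR == FNR { first[++n] = $0; next }   # first file: store lines in order
                   { second[++m] = $0 }        # second file: store lines in order
         END {
             for (i = 1; i <= n; i++)
                 for (j = 1; j <= m; j++)
                     print first[i] ", " second[j]
         }' "$1" "$2"
}

# Sample data (illustrative only)
demo_dir=$(mktemp -d)
printf '%s\n' a b > "$demo_dir/file1"
printf '%s\n' c d e > "$demo_dir/file2"
cartesian_awk "$demo_dir/file1" "$demo_dir/file2"
```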

Hmm, how about this hacky solution to use precompiled logic?

paste -d, <(sed -n "$(yes 'p;' | head -n $(wc -l < file2))" file1) \
          <(cat $(yes 'file2' | head -n $(wc -l < file1)))

Tags: Linux, Shell