How to know if a text file is a subset of another

If those file contents are called file1, file2 and file3 in order of appearance, then you can do it with the following one-liner:

 # python -c "x=open('file1').read(); y=open('file2').read(); print(x in y or y in x)"
 True
 # python -c "x=open('file2').read(); y=open('file1').read(); print(x in y or y in x)"
 True
 # python -c "x=open('file1').read(); y=open('file3').read(); print(x in y or y in x)"
 False

With perl:

if perl -0777 -e '$n = <>; $h = <>; exit(index($h,$n)<0)' needle.txt haystack.txt
then echo needle.txt is found in haystack.txt
fi

-0octal defines the record delimiter. When that octal number is greater than 0377 (the maximum byte value), there is no delimiter, which is equivalent to setting $/ = undef. In that case, <> returns the full content of a single file: that is slurp mode.

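Once we have the content of the files in the two variables $n and $h, we can use index() to determine whether the first is found in the second. index() returns -1 when $n does not occur in $h, so the exit status is 0 (success) only when the needle is found.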

However, that means both files are held whole in memory, so this method won't work for very large files.

For mmappable files (usually regular files and most seekable files such as block devices), that can be worked around by using mmap() on the files, for example with the Sys::Mmap perl module:

if 
  perl -MSys::Mmap -le '
    # or (unlike ||) has low precedence, so die runs if open fails
    open N, "<", $ARGV[0] or die "$ARGV[0]: $!";
    open H, "<", $ARGV[1] or die "$ARGV[1]: $!";
    mmap($n, 0, PROT_READ, MAP_SHARED, N);
    mmap($h, 0, PROT_READ, MAP_SHARED, H);
    exit (index($h, $n) < 0)' needle.txt haystack.txt
then
  echo needle.txt is found in haystack.txt
fi
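
For comparison, the same idea can be sketched with Python's standard mmap module. This is only a sketch under a few assumptions, not a translation of the Perl above: the function name is_contained is made up for illustration, both arguments are assumed to be regular files, and empty files are special-cased because Python refuses to mmap a zero-length file.

```python
import mmap
import os


def is_contained(needle_path, haystack_path):
    """Return True if the bytes of needle_path occur somewhere in haystack_path.

    is_contained is a hypothetical helper name, not part of any library.
    """
    # mmap.mmap() raises ValueError on empty files, so handle sizes first.
    needle_size = os.path.getsize(needle_path)
    if needle_size == 0:
        return True  # the empty string is found in any file
    if os.path.getsize(haystack_path) < needle_size:
        return False  # a longer needle cannot fit in a shorter haystack

    with open(needle_path, "rb") as nf, open(haystack_path, "rb") as hf:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(nf.fileno(), 0, access=mmap.ACCESS_READ) as n, \
             mmap.mmap(hf.fileno(), 0, access=mmap.ACCESS_READ) as h:
            # find() returns -1 when the needle is absent, like Perl's index().
            return h.find(n) != -1
```

Calling is_contained("needle.txt", "haystack.txt") plays the role of the exit status test in the shell snippet above.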

I found a solution thanks to this question

Basically I am testing two files a.txt and b.txt with this script:

#!/bin/bash

first_cmp=$(diff --unchanged-line-format= --old-line-format= --new-line-format='%L' "$1" "$2" | wc -l)
second_cmp=$(diff --unchanged-line-format= --old-line-format= --new-line-format='%L' "$2" "$1" | wc -l)

if [ "$first_cmp" -eq 0 ] || [ "$second_cmp" -eq 0 ]
then
    echo "Subset"
    exit 0
else
    echo "Not subset"
    exit 1
fi

If one file is a subset of the other, the script returns 0 (true); otherwise it returns 1.
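
Note that this is a line-based test rather than a substring test: diff should print no new lines when every line of one file can be matched, in order, to a line of the other. A rough Python sketch of that check follows. The names is_line_subsequence and either_is_subset are invented here, and the sketch assumes diff finds a minimal set of changes, so results could differ on inputs where diff's heuristics do not.

```python
def is_line_subsequence(small_path, big_path):
    """Return True if the lines of small_path appear, in order, in big_path."""
    with open(small_path, "rb") as small, open(big_path, "rb") as big:
        big_lines = iter(big)
        # "line in big_lines" consumes the iterator up to the first match,
        # so each later match must come after the previous one.
        return all(line in big_lines for line in small)


def either_is_subset(path_a, path_b):
    """Mirror the script's test: true if either file fits inside the other."""
    return (is_line_subsequence(path_a, path_b)
            or is_line_subsequence(path_b, path_a))
```

Only one line of each file is held in memory at a time, so this also works when the files are large.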