grep the exact block of lines (content of file1) from file2

grep works line by line, so it is pretty stupid when it comes to multiline patterns. Translating every newline character \n in both the pattern and the text to be searched into a NUL character \0 before comparing them works around this. The \0 characters in the output then have to be translated back to \n.
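To see what the translation does on its own, here is a minimal sketch using a made-up three-line sample (make_sample is a hypothetical helper, not one of the files from the question). After the first tr the text contains no newline at all, so a line-oriented tool sees it as a single line, and the second tr restores the original:

```shell
# Hypothetical sample text, used only for illustration.
make_sample() { printf 'A B\nC D\nE F\n'; }

# Count newline characters before and after translating \n to \0.
newlines_before=$(make_sample | wc -l)
newlines_after=$(make_sample | tr '\n' '\0' | wc -l)

# Translating \0 back to \n should give the original text again.
round_trip=$(make_sample | tr '\n' '\0' | tr '\0' '\n')

echo "newlines before: $((newlines_before))"
echo "newlines after:  $((newlines_after))"
printf '%s\n' "$round_trip"
```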

Here's your command, assuming that file1 contains the pattern you want to search for in file2:

grep -aof <(tr '\n' '\0' < file1) <(tr '\n' '\0' < file2) | tr '\0' '\n'

Example output for your given files:

A B
C D
E F
G H

Explanation:

<(tr '\n' '\0' < file1) creates a FIFO/named pipe/temporary file-like object that equals file1, but with all newline characters translated to NUL characters.
<(tr '\n' '\0' < file2) does the same, but for file2.
grep -f PATTERN_FILE INPUT_FILE searches for the pattern(s) from PATTERN_FILE in INPUT_FILE. Each pattern is interpreted as a regular expression unless -F is also given.
The -a flag of grep makes it treat binary input as text. This is needed because the input contains non-printable characters like \0, and without -a grep would only report that a binary file matches instead of printing the match.
The -o flag of grep makes it print only the matching sequence, not the whole line where it has been found.
| tr '\0' '\n' translates all NUL characters from the output of the command on the left side back to newline characters.
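The steps above can be tried end to end with a small sketch. It assumes GNU grep and uses made-up sample files in a scratch directory (workdir, pattern.nul and text.nul are names chosen for illustration, and the file contents are not the ones from the question). Intermediate files replace the process substitution here, so that each step can be inspected:

```shell
# Scratch directory with hypothetical sample files.
workdir=$(mktemp -d)
printf 'A B\nC D\n' > "$workdir/file1"             # block to look for
printf 'X Y\nA B\nC D\nZ Z\n' > "$workdir/file2"   # text to search

# Translate newlines to NUL in both the pattern and the text.
tr '\n' '\0' < "$workdir/file1" > "$workdir/pattern.nul"
tr '\n' '\0' < "$workdir/file2" > "$workdir/text.nul"

# Search, then translate the NULs in the match back to newlines.
result=$(grep -aof "$workdir/pattern.nul" "$workdir/text.nul" | tr '\0' '\n')

printf '%s\n' "$result"
rm -r "$workdir"
```

Note that the match is a plain substring of the translated text, so a block whose first line only matches the end of a longer line in file2 would still be found.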

The following is clumsy, but works with GNU awk (RT, the text matched by the record separator RS, is a gawk extension):

awk -v RS="$(<file1)" '{print RT}' file2

Just for fun, in pure bash:

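mapfile -t <file1
i=0
while IFS= read -r line; do
    # Advance on a match; otherwise restart, counting this line if it equals the first line of the block
    [ "$line" = "${MAPFILE[i++]}" ] || { [ "$line" = "${MAPFILE[0]}" ] && i=1 || i=0; }
    # Whole block matched: print it and start over
    [ "$i" -eq "${#MAPFILE[@]}" ] && { printf "%s\n" "${MAPFILE[@]}"; i=0; }
done <file2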

