How to use awk to extract data from a file based on the content of another file?
You can set variables like RS separately for each file. For example:
$ awk 'NR==FNR{a[$0];next} $1 in a' RS='\r?\n' title_list.txt RS='+++\r?\n' FS='\r?\n' ORS='+++\n' data.txt
article 1 title
article 1 body line 1
article 1 body line 2
+++
article 3 title
article 3 body line 1
article 3 body line 2
+++
article 4 title
article 4 body line 1
article 4 body line 2
article 4 body line 3
+++
In gawk
you can use special blocks BEGINFILE
and ENDFILE
to set whatever rules you need before/after reading new file, for example:
$ awk 'NR==FNR{a[$0]++;next}ENDFILE{RS="+++\n";FS="\n"}a[$1]{printf $0RT}' title_list.txt data.txt
article 1 title
article 1 body line 1
article 1 body line 2
+++
article 3 title
article 3 body line 1
article 3 body line 2
+++
article 4 title
article 4 body line 1
article 4 body line 2
article 4 body line 3