How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?
Print lines between PAT1 and PAT2
$ awk '/PAT1/,/PAT2/' file
PAT1
3 - first block
4
PAT2
PAT1
7 - second block
PAT2
PAT1
10 - third block
Or, using variables:
awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' file
How does this work?
/PAT1/ matches lines containing this text, and /PAT2/ does likewise.
/PAT1/{flag=1} sets the flag when the text PAT1 is found in a line.
/PAT2/{flag=0} unsets the flag when the text PAT2 is found in a line.
flag is a pattern with the default action, which is to print $0: if flag equals 1, the line is printed. This way, it prints all lines from the moment PAT1 occurs until the next PAT2 is seen. It also prints the lines from the last match of PAT1 up to the end of the file.
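To watch the flag change, here is a small sketch that prints its value in front of every line. The sample input is an assumption, reconstructed from the outputs shown in this answer, and the trace variable is only there for illustration:

```shell
# Assumed sample input, reconstructed from the outputs shown in this answer.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
    PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block' > "$sample"

# Same rule order as the flag command, but the bare "flag" pattern is replaced
# by a printf showing the flag value (0 or 1) at the point where it is tested.
trace=$(awk '/PAT1/{flag=1} {printf "%d: %s\n", flag, $0} /PAT2/{flag=0}' "$sample")
printf '%s\n' "$trace"

rm -f "$sample"
```

Lines prefixed with 1 are the ones the flag command prints. The last PAT1 is never closed, so the flag stays at 1 until the end of the file.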
Print lines between PAT1 and PAT2 - not including PAT1 and PAT2
$ awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' file
3 - first block
4
7 - second block
10 - third block
This uses next to skip the line that contains PAT1, so that it is not printed.
This call to next can be dropped by reshuffling the blocks: awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' file.
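A quick way to check that both exclusive variants agree is to run them on the same input and compare. This is only a sketch; the sample input is assumed, reconstructed from the outputs shown in this answer:

```shell
# Assumed sample input, reconstructed from the outputs shown in this answer.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
    PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block' > "$sample"

# Variant that skips the PAT1 line with "next".
with_next=$(awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' "$sample")
# Variant without "next": the flag is tested before PAT1 can set it.
reshuffled=$(awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' "$sample")

if [ "$with_next" = "$reshuffled" ]; then
    echo "both variants print the same lines:"
fi
printf '%s\n' "$reshuffled"

rm -f "$sample"
```

The two are not identical on every input: if a second PAT1 appears before the block is closed, the next version skips that line while the reshuffled one prints it.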
Print lines between PAT1 and PAT2 - including PAT1
$ awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' file
PAT1
3 - first block
4
PAT1
7 - second block
PAT1
10 - third block
Because flag is placed at the very end, it acts on the value that PAT1 or PAT2 has just set on the current line: print on PAT1, do not print on PAT2.
Print lines between PAT1 and PAT2 - including PAT2
$ awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' file
3 - first block
4
PAT2
7 - second block
PAT2
10 - third block
Because flag is placed at the very beginning, it acts on the value set by an earlier line, and hence prints the closing pattern but not the starting one.
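The four variants differ only in which marker lines they keep, so they can be folded into one parameterised command. A minimal sketch, assuming a hypothetical shell function print_between with a mode argument (neither is part of the original answer) and an assumed sample input reconstructed from the outputs shown here:

```shell
# Hypothetical helper, not part of the original answer.
# usage: print_between PAT1 PAT2 MODE FILE
# MODE selects which marker lines are kept: both, none, start or end.
print_between() {
    awk -v pat1="$1" -v pat2="$2" -v mode="$3" '
        !flag && $0 ~ pat1 {            # opening marker: start a block
            flag = 1
            if (mode == "both" || mode == "start") print
            next
        }
        flag && $0 ~ pat2 {             # closing marker: end the block
            flag = 0
            if (mode == "both" || mode == "end") print
            next
        }
        flag                            # ordinary line inside a block
    ' "$4"
}

# Assumed sample input, reconstructed from the outputs shown in this answer.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
    PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block' > "$sample"

only_start=$(print_between PAT1 PAT2 start "$sample")
only_end=$(print_between PAT1 PAT2 end "$sample")
printf '%s\n' "$only_start"

rm -f "$sample"
```

Note that awk processes backslash escapes in -v assignments, so a pattern containing backslashes would need them doubled. Unlike the one-liners, this sketch treats a PAT1 line inside an already open block as an ordinary line.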
Print lines between PAT1 and PAT2 - excluding lines from the last PAT1 to the end of file if no other PAT2 occurs
This is based on a solution by Ed Morton.
awk 'flag{
        if (/PAT2/)
           {printf "%s", buf; flag=0; buf=""}
        else
            buf = buf $0 ORS
     }
     /PAT1/ {flag=1}' file
As a one-liner:
$ awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' file
3 - first block
4
7 - second block
# note the lack of third block, since no other PAT2 happens after it
This keeps all the selected lines in a buffer that starts being populated the moment PAT1 is found. It then keeps being filled with the following lines until PAT2 is found. At that point, it prints the stored content and empties the buffer.
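The same buffering idea can be adapted to keep the marker lines as well. The following is a sketch of such an adaptation (my own assumption, not taken from the original solution), run on an assumed sample input reconstructed from the outputs shown in this answer:

```shell
# Assumed sample input, reconstructed from the outputs shown in this answer.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
    PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block' > "$sample"

# Adaptation: buffer the marker lines too, and print the buffer only once
# the closing PAT2 has been seen.
complete=$(awk '
    !flag && /PAT1/ {flag=1; buf=""}             # open a block, start a fresh buffer
    flag            {buf = buf $0 ORS}           # collect every line of the open block
    flag && /PAT2/  {printf "%s", buf; flag=0}   # block is complete: flush it
' "$sample")
printf '%s\n' "$complete"

rm -f "$sample"
```

On this input the first two blocks are printed with their markers, and the unterminated third block is dropped because its buffer is never flushed.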
What about the classic sed solution?
Print lines between PAT1 and PAT2 - include PAT1 and PAT2
sed -n '/PAT1/,/PAT2/p' FILE
Print lines between PAT1 and PAT2 - exclude PAT1 and PAT2
GNU sed:
sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
Any sed [1]:
sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' FILE
or even (Thanks Sundeep):
GNU sed:
sed -n '/PAT1/,/PAT2/{//!p}' FILE
Any sed:
sed -n '/PAT1/,/PAT2/{//!p;}' FILE
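The shorter form relies on the empty regular expression //, which sed treats as the last regular expression it applied. Inside the range that is PAT1 on the opening line and PAT2 on every later line. A quick check that both forms agree, using an assumed sample input reconstructed from the outputs shown in this answer:

```shell
# Assumed sample input, reconstructed from the outputs shown in this answer.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
    PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block' > "$sample"

# Explicit form: test both markers by name.
explicit=$(sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p;};}' "$sample")
# Shorthand: // stands for whichever range address was tested last.
shorthand=$(sed -n '/PAT1/,/PAT2/{//!p;}' "$sample")

if [ "$explicit" = "$shorthand" ]; then
    echo "both forms print the same lines:"
fi
printf '%s\n' "$shorthand"

rm -f "$sample"
```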
Print lines between PAT1 and PAT2 - include PAT1 but not PAT2
The following includes just the range start:
GNU sed:
sed -n '/PAT1/,/PAT2/{/PAT2/!p}' FILE
Any sed:
sed -n '/PAT1/,/PAT2/{/PAT2/!p;}' FILE
Print lines between PAT1 and PAT2 - include PAT2 but not PAT1
The following includes just the range end:
GNU sed:
sed -n '/PAT1/,/PAT2/{/PAT1/!p}' FILE
Any sed:
sed -n '/PAT1/,/PAT2/{/PAT1/!p;}' FILE
[1] Note about BSD/Mac OS X sed
A command like this:
sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
would emit an error:
▶ sed -n '/PAT1/,/PAT2/{/PAT1/!{/PAT2/!p}}' FILE
sed: 1: "/PAT1/,/PAT2/{/PAT1/!{/ ...": extra characters at the end of p command
For this reason this answer has been edited to include BSD and GNU versions of the one-liners.
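As far as I can tell, BSD sed complains because it expects the p command to be terminated by a semicolon or a newline before the closing brace. Besides adding semicolons, writing one command per line should work in both implementations. A sketch on an assumed sample input, reconstructed from the outputs shown in this answer:

```shell
# Assumed sample input, reconstructed from the outputs shown in this answer.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
    PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block' > "$sample"

# One command per line: every command, including p, ends at a newline,
# so no semicolon is needed before a closing brace.
portable=$(sed -n '/PAT1/,/PAT2/{
    /PAT1/!{
        /PAT2/!p
    }
}' "$sample")
printf '%s\n' "$portable"

rm -f "$sample"
```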
Using grep with PCRE (where available) to print markers and lines between markers:
$ grep -Pzo "(?s)(PAT1(.*?)(PAT2|\Z))" file
PAT1
3 - first block
4
PAT2
PAT1
7 - second block
PAT2
PAT1
10 - third block
-P: perl-regexp, PCRE. Not available in all grep variants.
-z: treat the input as a set of lines, each terminated by a zero byte instead of a newline.
-o: print only the matching part.
(?s): DotAll, i.e. the dot matches newlines as well.
(.*?): non-greedy match.
\Z: match only at the end of the string, or before a newline at the end.
Print lines between markers excluding end marker:
$ grep -Pzo "(?s)(PAT1(.*?)(?=(\nPAT2|\Z)))" file
PAT1
3 - first block
4
PAT1
7 - second block
PAT1
10 - third block
(.*?)(?=(\nPAT2|\Z)): non-greedy match with a lookahead for \nPAT2 or \Z.
Print lines between markers excluding markers:
$ grep -Pzo "(?s)((?<=PAT1\n)(.*?)(?=(\nPAT2|\Z)))" file
3 - first block
4
7 - second block
10 - third block
(?<=PAT1\n): positive lookbehind for PAT1\n.
Print lines between markers excluding start marker:
$ grep -Pzo "(?s)((?<=PAT1\n)(.*?)(PAT2|\Z))" file
3 - first block
4
PAT2
7 - second block
PAT2
10 - third block