How can I "grep" patterns across multiple lines?
Here's a sed one-liner that will give you grep-like behavior across multiple lines:
sed -n '/foo/{:start /bar/!{N;b start};/your_regex/p}' your_file
How it works
-n suppresses the default behavior of printing every line.
/foo/{} instructs it to match foo and do what comes inside the squigglies to the matching lines. Replace foo with the starting part of the pattern.
:start is a branching label to help us keep looping until we find the end of our regex.
/bar/!{} will execute what's in the squigglies on the lines that don't match bar. Replace bar with the ending part of the pattern.
N appends the next line to the active buffer (sed calls this the pattern space).
b start will unconditionally branch to the start label we created earlier, so as to keep appending the next line as long as the pattern space doesn't contain bar.
/your_regex/p prints the pattern space if it matches your_regex. You should replace your_regex with the whole expression you want to match across multiple lines.
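As a rough illustration of the loop described above, here is a sketch that runs the same script against a small sample, with abc standing in for foo, def for bar, and abc.*def for your_regex. The sample text and the variable names are assumptions made for this example, and the script is written with one command per line, since some sed implementations read a label up to the end of the line.

```shell
# Sample input (an assumption for this sketch): the match starts on line 1
# and ends on line 3.
sample='abc blah
blah blah
def blah
blah blah'

# Same script as the one-liner, spread over several lines.
sed_out=$(printf '%s\n' "$sample" | sed -n '
/abc/{
  :start
  /def/!{
    N
    b start
  }
  /abc.*def/p
}')

printf '%s\n' "$sed_out"
```

With this input the pattern space holds the first three lines by the time def is found, so those three lines are printed and the trailing blah blah line is not.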
I generally use a tool called pcregrep, which can be installed on most Linux distributions using yum or apt.
For example, suppose you have a file named testfile with the following content:
abc blah
blah blah
def blah
blah blah
You can run the following command:
$ pcregrep -M 'abc.*(\n|.)*def' testfile
to do pattern matching across multiple lines. The -M option enables multiline matching, so a single pattern can span more than one line.
You can get a similar result with sed, which prints every line from the first line matching abc through the next line matching def:
$ sed -e '/abc/,/def/!d' testfile
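To make the range behavior concrete, the following sketch feeds the testfile content to sed through a pipe instead of a file and compares the command above with the equivalent -n/p spelling. The variable names are made up for this example.

```shell
# Same content as testfile, held in a variable for this sketch.
sample='abc blah
blah blah
def blah
blah blah'

# Delete every line that is NOT inside the /abc/ .. /def/ range.
range_delete=$(printf '%s\n' "$sample" | sed -e '/abc/,/def/!d')

# Equivalent spelling: suppress autoprint and print only lines in the range.
range_print=$(printf '%s\n' "$sample" | sed -n '/abc/,/def/p')

printf '%s\n' "$range_delete"
```

Both forms print whole lines from the range, so the output ends with def blah rather than stopping at def.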
A normal grep that supports the Perl-regexp option -P will also do this job.
$ echo 'abc blah
blah blah
def blah
blah blah' | grep -oPz '(?s)abc.*?def'
abc blah
blah blah
def
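(?s) is the DOTALL modifier, which makes the dot in your regex match line breaks as well as ordinary characters. The -z option makes grep treat the input as NUL-separated, so the whole text is searched as a single record, and -o prints only the matched part.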
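One detail that is easy to miss: with -z, grep terminates each piece of output with a NUL byte instead of a newline. The sketch below, which assumes GNU grep built with PCRE support, uses a made-up sample containing two matches and converts the NUL terminators to newlines with tr so the matches can be read or processed line by line.

```shell
# Sample with two separate abc ... def blocks (an assumption for this sketch).
sample='abc one
def
xyz
abc two
def'

# -o prints each match, -z ends each with NUL; tr turns the NULs into newlines.
matches=$(printf '%s\n' "$sample" | grep -oPz '(?s)abc.*?def' | tr '\0' '\n')

printf '%s\n' "$matches"
```

The non-greedy .*? matters here: a greedy .* would run from the first abc to the last def and report a single match.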