Print only the Nth line before each line that matches a pattern
You need to keep a buffer of the most recent lines.
Try this:
awk -v N=4 -v pattern="example.*pattern" '{i=(1+(i%N)); if (NR>N && $0 ~ pattern) print buffer[i]; buffer[i]=$0}' file
Set N to the number of lines before the match at which the line to print sits. Set pattern to the regex to search for.
buffer is an array of N elements used to store the most recent lines. Each time the pattern is found, the Nth line before the matching line is printed.
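To make the buffer indexing easier to follow, here is a small sketch of the same idea wrapped in a shell function and run on sample input. The function name nth_before and the sample data are made up for illustration. The guard here is NR > N, which is an assumption on my part about the intended behaviour: it prints the buffered line even when that line is empty.

```shell
# Hypothetical helper: print the Nth line before each line matching a regex.
# Usage: nth_before N REGEX [FILE...]   (reads stdin when no file is given)
nth_before() {
  n=$1 re=$2
  shift 2
  awk -v N="$n" -v pattern="$re" '
    {
      i = 1 + (i % N)          # slot holding the line read N lines ago
      if (NR > N && $0 ~ pattern)
        print buffer[i]        # that slot is the Nth line before the match
      buffer[i] = $0           # overwrite the slot with the current line
    }' "$@"
}

# Sample input: ten lines "1" to "10"; lines 8 and 10 match the regex.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | nth_before 3 '^(8|10)$'
```

With this input it should print 5 and 7, the third line before each match. Because the buffer is overwritten in place, matches that are closer together than N lines are each handled independently.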
That code doesn't work for previous lines. To get lines before the matched pattern, you need to somehow save the lines already processed. Since awk only has associative arrays, I can't think of an equally simple way of doing what you want in awk, so here's a perl solution:
perl -ne 'push @lines,$_; print $lines[0] if /PAT/; shift(@lines) if $.>LIM;' file
Change PAT to the pattern you want to match and LIM to the number of lines. For example, to print the 5th line before each occurrence of foo, you would run:
perl -ne 'push @lines,$_; print $lines[0] if /foo/; shift(@lines) if $.>5;' file
Explanation
perl -ne : read the input file line by line and apply the script given by -e to each line.
push @lines,$_ : add the current line ($_) to the array @lines.
print $lines[0] if /PAT/ : print the first element in the array @lines ($lines[0]) if the current line matches the desired pattern.
shift(@lines) if $.>LIM; : $. is the current line number. If that is greater than the limit, remove the first value from the array @lines. The result is that @lines will always hold the last LIM lines.
Note that a match within the first LIM lines prints the first line of the file, since no line LIM lines back exists yet.
tac file | awk 'c&&!--c;/pattern/{c=N}' | tac
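Replace N with the number of lines (or set it with -v N=...). This has the same omission as the 'forwards' use case when there are multiple matches within N lines of each other: the second match resets the counter c before it reaches zero, so the line belonging to the first match is never printed.
It also won't work well when the input is piped from a running process, because tac has to read all of its input before it can output anything, but it's the simplest way when the input file is complete and not growing.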
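To see the omission concretely, here is a sketch that runs the pipeline on input where two matches sit closer together than N lines. The function name tac_nth_before and the sample data are hypothetical, and it assumes GNU tac is available.

```shell
# Hypothetical wrapper around the tac/awk/tac pipeline; reads stdin.
# Usage: tac_nth_before N REGEX
tac_nth_before() {
  tac |
    awk -v N="$1" -v re="$2" '
      c && !--c          # print when the countdown reaches zero
      $0 ~ re { c = N }  # (re)start the countdown on every match
    ' |
    tac
}

# "hit" appears on lines 5 and 7, only two lines apart, with N=3.
printf '%s\n' a b c d hit e hit | tac_nth_before 3 hit
```

This should print only b (the third line before line 5). The line d, which is the third line before line 7, is lost because the match on line 5 restarts the countdown first.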