Remove lines based on pattern but keeping first n lines that match
If you want to delete all lines starting with % while preserving the first two lines of the input, you could do:
sed -e 1,2b -e '/^%/d'
Though the same would be more legible with awk:
awk 'NR <= 2 || !/^%/'
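Or, if you're after performance (this relies on head leaving the file offset right after the second line, so the input has to be a seekable file rather than a pipe):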
{ head -n 2; grep -v '^%'; } < input-file
If you want to preserve the first two lines matching the pattern, which may not be the first lines of the input, awk would certainly be a better option:
awk '!/^%/ || ++n <= 2'
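To make the number of preserved matches configurable, the same one-liner can be wrapped in a small shell function. This is only a sketch: the function name keep_first_matches and the awk variable max are made up for illustration and are not part of the answer above.

```shell
# keep_first_matches N: copy stdin to stdout, dropping lines that start
# with % except for the first N of them (hypothetical helper name).
keep_first_matches() {
  awk -v max="$1" '!/^%/ || ++n <= max'
}

# Sample run: "% three" is the third match, so it is the only line dropped.
printf '%s\n' 'a' '% one' 'b' '% two' '% three' 'c' | keep_first_matches 2
```

Non-matching lines short-circuit the || and never touch the counter, so n only counts lines that start with %.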
With sed, you could use tricks like:
sed -e '/^%/!b' -e 'x;/xx/{h;d;}' -e 's/^/x/;x'
That is, use the hold space to count the occurrences of the pattern matched so far. Not terribly efficient or legible.
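For readers who want to follow the hold-space trick step by step, here is the same sed script spread over several lines with comments. The wrapper function name keep_two_sed is hypothetical; the sed commands themselves are unchanged from the one-liner.

```shell
# keep_two_sed: same script as the one-liner, one command per line
# (hypothetical wrapper name, reads stdin and writes stdout).
keep_two_sed() {
  sed '
    # Lines that do not start with % are printed unchanged.
    /^%/!b
    # Swap: pattern space now holds the counter, hold space the line.
    x
    # Two matches already kept: put the counter back and drop the line.
    /xx/{
      h
      d
    }
    # Otherwise add one x to the counter and swap the line back in.
    s/^/x/
    x
  '
}

printf '%s\n' 'a' '% one' '% two' 'b' '% three' 'c' | keep_two_sed
```

The counter is a string of x characters kept in the hold space, one per match printed so far, which is why the limit is written as the regular expression /xx/.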
I'm afraid sed alone is a bit too simple for this (not that it would be impossible, just rather complicated; see e.g. sed sokoban for what can be done).
How about awk?
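#!/bin/awk -f
BEGIN { c = 0; }
{
    if (/^%/) {
        # Line starts with %: print it only for the first three matches.
        if (c++ < 3) {
            print;
        }
    } else {
        # Every other line is printed unchanged.
        print;
    }
}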
If you can rely on a recent enough bash (one that supports regular expressions), the awk script above can be translated to:
#!/bin/bash -
c=0
while IFS= read -r line; do
    if [[ $line =~ ^% ]]; then
        if ((c++ < 3)); then
            printf '%s\n' "$line"
        fi
    else
        printf '%s\n' "$line"
    fi
done
You can also use sed or grep to do the pattern matching instead of the =~ operator.
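As a rough sketch of the grep variant, the loop below uses only POSIX sh constructs, with grep -q deciding whether a line matches. The function name filter_with_grep is invented for this example, and the limit of three is carried over from the scripts above.

```shell
# filter_with_grep: same loop as above, but grep does the matching
# (hypothetical helper name, reads stdin and writes stdout).
filter_with_grep() {
  c=0
  while IFS= read -r line; do
    if printf '%s\n' "$line" | grep -q '^%'; then
      # Matching line: keep only the first three.
      if [ "$c" -lt 3 ]; then
        printf '%s\n' "$line"
      fi
      c=$((c + 1))
    else
      printf '%s\n' "$line"
    fi
  done
}

printf '%s\n' '%1' 'x' '%2' '%3' '%4' 'y' | filter_with_grep
```

Note that this starts one grep process per input line, so it is likely to be much slower than the awk version on large inputs.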
A Perl one-liner solution (this keeps the first two lines of the input, whatever they contain):
# in-place editing
perl -i -pe '$.>2 && s/^%.*//s' filename.txt
# print to the standard output
perl -ne '$.>2 && /^%/ || print' filename.txt