Filter lines that contain a fixed number of pattern occurrences

$ grep 'foo' file | grep -v 'foo.*foo'

The first grep picks out all lines containing foo; the second removes every line in which foo is followed by another foo somewhere on the line. What remains are the lines with exactly one foo.

If all lines contain at least one foo (as in your example), you may skip the first grep.

For a general solution to "How do I grep for exactly N occurrences of a string?": grep for lines with at least N matches, then remove lines with N+1 matches (or more).
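A minimal sketch of that general recipe, assuming a grep that supports -E with interval expressions and N of at least 1. The function name exactly_n is hypothetical, and the pattern is treated as an extended regular expression whose matches are counted without overlap:

```shell
# Hypothetical helper: print stdin lines containing exactly N
# non-overlapping matches of the extended regex PATTERN.
# Usage: exactly_n PATTERN N < file
exactly_n() {
    pat=$1
    n=$2
    # Keep lines with at least N matches ...
    grep -E "(($pat).*){$n}" |
        # ... then drop lines with N+1 matches or more.
        grep -Ev "(($pat).*){$((n + 1))}"
}
```

For example, exactly_n foo 3 < infile should print only the lines with three occurrences of foo.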

For the general case (print only lines with exactly N occurrences), you could use awk's gsub(), which returns the number of substitutions made, and print the line if that number matches the requirement. For example, to print lines with exactly three occurrences:

 awk '{l=$0;t=gsub(/foo/,"",l)}t==3' infile
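The same idea can be parameterised. This is only a sketch: the function name count_exact and the variable names pat and n are assumptions, and because awk -v processes backslash escapes, patterns containing backslashes may need extra escaping.

```shell
# Hypothetical helper: print stdin lines in which the regex PATTERN
# matches exactly N times (counted by awk's gsub()).
# Usage: count_exact PATTERN N < file
count_exact() {
    # Work on a copy of the record so the printed line is unmodified.
    awk -v pat="$1" -v n="$2" '{ line = $0 } gsub(pat, "", line) == n'
}
```

With N set to 0 this prints the lines that do not match at all, since gsub() then returns 0.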

Another way with sed:

sed 's/foo/&/3
t x
: k
d
: x
s/foo/&/4
t k' infile

This attempts to replace the 3rd occurrence with itself. If that fails, the line is deleted. If it succeeds, sed branches to : x, where it attempts to replace the 4th occurrence with itself. If that also succeeds, the line has more than 3 occurrences, so sed branches to : k and the line is deleted as well; otherwise nothing more happens and the line is auto-printed.
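To see the branching in action, here is a small sketch that wraps the same script in a function and feeds it three sample lines. The function name exactly_three and the sample input are made up for illustration, and the labels are written without the space after the colon:

```shell
# Hypothetical wrapper around the sed script above; reads stdin.
exactly_three() {
    sed 's/foo/&/3
t x
:k
d
:x
s/foo/&/4
t k'
}

# Only the middle line has exactly three occurrences of foo.
printf '%s\n' 'foo' 'foo foo foo' 'foo foo foo foo' | exactly_three
```

The first line fails the initial substitution and is deleted, the last one reaches the second substitution and is deleted via k, so only foo foo foo is printed.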

For the particular case in your example (lines with only one occurrence), you could also use:

sed '/foo/!d;/foo.*foo/d' infile

or something like:

pcregrep '^(?:(?!foo).)*foo((?:(?!foo).)*)$' infile

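Using grep -o to count occurrences in a shell loop (here, lines with exactly two; -o is not in POSIX but is supported by GNU and BSD grep):

while IFS= read -r line; do [[ $(grep -o foo <<< "$line" | wc -l) -eq 2 ]] && printf '%s\n' "$line"; done < file.txt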
