Grep lines starting with 1, but not 10, 11, 100 etc

The question in the body

Select lines that start with a 1 followed by a whitespace character:

grep -c '^1\s'          file
grep -c '^1[[:space:]]' file

The -c option makes grep print the count of matching lines itself, so no call to wc is needed.
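As a quick check, here is a minimal sketch that assumes a small tab-delimited sample (the file contents and the variable names are made up for illustration, they are not from the question):

```shell
# Hypothetical sample data: tab-delimited, first field is a number.
sample=$(mktemp)
printf '1\tTGCAG\n10\tAAGCT\n1\tTGCAG\n11\tCCGTA\n100\tGGATC\n' > "$sample"

# -c prints the number of matching lines, so no pipe to wc -l is needed.
count=$(grep -c '^1[[:space:]]' "$sample")
echo "$count"    # 2 (only the two lines whose first field is exactly 1)

rm -f "$sample"
```

The lines starting with 10, 11 and 100 are skipped because the character right after the leading 1 is a digit, not whitespace.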

The question in the title

A 1 followed either by a character that is not a digit or by nothing at all (end of line):

grep -cE '^1([^0-9]|$)' file
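The difference between the two readings shows up on lines where the 1 is followed by something other than whitespace. A minimal sketch, assuming a made-up sample file (the contents and variable names are hypothetical):

```shell
# Hypothetical sample: tab after the 1, a bare 1, a colon after the 1, and a 10.
sample=$(mktemp)
printf '1\tTGCAG\n1\n1:x\n10\tAAGCT\n' > "$sample"

# "Title" reading: 1 followed by a non-digit or by end of line.
title_count=$(grep -cE '^1([^0-9]|$)' "$sample")
# "Body" reading: 1 followed by a whitespace character.
body_count=$(grep -c '^1[[:space:]]' "$sample")

echo "$title_count"   # 3 (tab line, bare 1, and 1:x)
echo "$body_count"    # 1 (only the tab line)

rm -f "$sample"
```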

But both solutions above have some interesting issues; keep reading.

In the body of the question the user claims that the file is "tab delimited".

Delimiter

tab

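A line starting with a 1 followed by a tab. The tab has to reach grep as a literal character: either type it into the command (Ctrl-V Tab in most shells) or generate it with printf, as here. This fails if the delimiter is a space (or anything else, or nothing at all):

grep "^1$(printf '\t')" file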

space

A line starting with a 1 followed by a space (an actual space in the command). This fails if the delimiter is anything else, or if there is none:

grep '^1 ' file

tab or space

grep -E "^1($(printf '\t')| )" file
grep '^1[[:blank:]]' file
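To see how the three delimiter patterns differ, here is a minimal sketch on a made-up file that mixes delimiters (file contents and variable names are assumptions for illustration):

```shell
# Hypothetical sample: one tab-delimited, one space-delimited, one colon-delimited line.
sample=$(mktemp)
printf '1\ttab\n1 space\n1:colon\n10\ttab\n' > "$sample"

tab=$(printf '\t')                              # a literal tab character
tab_only=$(grep -c "^1${tab}" "$sample")        # 1
space_only=$(grep -c '^1 ' "$sample")           # 1
either=$(grep -c '^1[[:blank:]]' "$sample")     # 2 ([:blank:] is space or tab)

echo "$tab_only $space_only $either"

rm -f "$sample"
```

None of the three patterns matches the colon-delimited line.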

whitespace

A more flexible option is to accept any whitespace character, horizontal or vertical. The [:space:] character class consists of space, \t (horizontal tab), \r (carriage return), \n (newline), \v (vertical tab) and \f (form feed). grep cannot match the newline, though: it is an internal limitation that can only be avoided with the -z option. The class still works well as a description of the delimiter. It is also possible, and shorter, to use the \s shorthand available in GNU grep:

grep -c '^1[[:space:]]' file
grep -c '^1\s'          file

But this option will fail if the delimiter is something like a colon : or any other punctuation character (or any letter).
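One practical case where [:space:] matches and [:blank:] does not is a carriage return left over from DOS line endings. A minimal sketch, assuming a made-up sample file (contents and variable names are hypothetical):

```shell
# Hypothetical sample: a bare 1 with a CRLF ending, a tab line, a colon line, and a 10.
sample=$(mktemp)
printf '1\r\n1\ttab\n1:colon\n10\ttab\n' > "$sample"

space_count=$(grep -c '^1[[:space:]]' "$sample")   # 2: the \r line and the tab line
blank_count=$(grep -c '^1[[:blank:]]' "$sample")   # 1: the tab line only

echo "$space_count $blank_count"

rm -f "$sample"
```

The colon-delimited line is missed by both classes.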

Boundary

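Or, we can match the boundary right after the 1. Strictly speaking this is not a transition from a digit to "not a digit", but a transition to a character that is not in [_[:alnum:]] (that is, not one of _a-zA-Z0-9) or to the end of the line: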

grep -c  '^1\b' file       # portable but not POSIX.
grep -c  '^1\>' file       # portable but not POSIX.
grep -wc '^1'   file       # portable but not POSIX.
grep -c  '^1\W' file       # portable but not POSIX (does not match a line that is only "1"; underscore handling differs in BSD).

These patterns also accept lines that start with a 1 followed by a punctuation character.
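A minimal sketch of how the boundary variants behave, assuming GNU grep (\b, \W and -w are extensions) and a made-up sample file; the contents and variable names are hypothetical:

```shell
# Hypothetical sample: tab, colon, bare 1, underscore, and a 10.
sample=$(mktemp)
printf '1\ttab\n1:colon\n1\n1_x\n10\ttab\n' > "$sample"

b_count=$(grep -c  '^1\b' "$sample")   # 3: tab, colon and the bare 1
w_count=$(grep -wc '^1'   "$sample")   # 3: same lines as \b
W_count=$(grep -c  '^1\W' "$sample")   # 2: \W needs a character, so the bare 1 is missed

echo "$b_count $w_count $W_count"

rm -f "$sample"
```

The line 1_x is rejected by all three, because the underscore counts as a word character.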

Sounds like you just want this:

$ grep '^1\b' a
1        TGCAG.....
1        TGCAG......

For the counting portion of this:

$ grep -c '^1\b' file
2

With awk:

awk '$1 == "1" { print; x++ } END { print x+0, "total matches" }' inputfile
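The awk version compares the whole first field, so the delimiter question moves into the field separator. A minimal sketch, assuming a made-up sample file (contents and variable names are hypothetical); by default awk splits fields on runs of spaces and tabs, and -F selects another separator:

```shell
# Hypothetical sample: tab, space and colon delimiters mixed, plus a 10.
sample=$(mktemp)
printf '1\tTGCAG\n10\tAAGCT\n1 TGCAG\n1:colon\n' > "$sample"

# Default splitting: the first field of the colon line is "1:colon", so it is skipped.
ws_count=$(awk '$1 == "1" { n++ } END { print n+0 }' "$sample")
# Colon as separator: now only the colon line has a first field of exactly "1".
colon_count=$(awk -F: '$1 == "1" { n++ } END { print n+0 }' "$sample")

echo "$ws_count $colon_count"   # 2 1

rm -f "$sample"
```

Comparing against the string "1" rather than the number 1 keeps fields such as 01 or 1.0 from matching.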
