Grep lines starting with 1, but not 10, 11, 100 etc
The question in the body
Select lines that start with a 1 and are followed by a space:
grep -c '^1\s' file
grep -c '^1[[:space:]]' file
The -c option makes grep print the count of matching lines, so no call to wc is needed.
The question in the title
A 1 followed by a non-digit or by the end of the line:
grep -cE '^1([^0-9]|$)' file
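To see how the two readings differ, here is a small sketch run against a made-up sample file (the file contents and the variable names are assumptions for illustration, not taken from the question):

```shell
# Hypothetical sample data: tab-delimited lines plus two edge cases
# ("1:GGGG" and a bare "1").
sample=$(mktemp)
printf '1\tTGCAG\n10\tAAAA\n11\tCCCC\n1:GGGG\n1\n100\tTTTT\n' > "$sample"

# Body reading: a 1 followed by a whitespace character.
# Only "1<TAB>TGCAG" qualifies.
body_count=$(grep -c '^1[[:space:]]' "$sample")

# Title reading: a 1 followed by a non-digit or by the end of the line.
# "1<TAB>TGCAG", "1:GGGG" and the bare "1" all qualify.
title_count=$(grep -cE '^1([^0-9]|$)' "$sample")

echo "body: $body_count, title: $title_count"
rm -f "$sample"
```

On this sample the body version counts one line and the title version counts three, which is the kind of discrepancy the rest of the answer digs into.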
Both solutions above have some subtle issues, though; keep reading.
In the body of the question, the user claims that the file is "tab delimited".
Delimiter
tab
A line starting with a 1 followed by a tab (an actual tab character in the command; in most interactive shells it can be typed as Ctrl-V followed by Tab). This fails if the delimiter is a space (or any other character, or none):
grep '^1	' file
space
A line starting with a 1 followed by a space (an actual space in the command). This fails if the delimiter is any other character, or if there is none:
grep '^1 ' file
tab or space
grep -E '^1(	| )' file   # the first alternative is a literal tab; -E is needed for ( | )
grep '^1[[:blank:]]' file
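Since a literal tab is easy to lose when copying a command, one possible workaround is to build the tab with printf and interpolate it into the pattern. The sketch below uses a made-up sample file and hypothetical variable names to compare the three delimiter choices:

```shell
# Hypothetical sample: one tab-delimited line, one space-delimited line,
# one line starting with 10, and one colon-delimited line.
sample=$(mktemp)
printf '1\tfoo\n1 bar\n10\tbaz\n1:qux\n' > "$sample"

# Store a real tab in a variable so it survives copy and paste.
tab=$(printf '\t')

tab_only=$(grep -c "^1$tab" "$sample")          # matches "1<TAB>foo" only
space_only=$(grep -c '^1 ' "$sample")           # matches "1 bar" only
either=$(grep -c '^1[[:blank:]]' "$sample")     # matches both of the above

echo "tab: $tab_only, space: $space_only, blank: $either"
rm -f "$sample"
```

None of the three patterns matches the colon-delimited line.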
whitespace
A more flexible option is to accept several whitespace characters (horizontal and vertical). The [:space:] character class is composed of the space, \t (horizontal tab), \r (carriage return), \n (newline), \v (vertical tab) and \f (form feed). grep cannot match a newline (an internal limitation that can only be avoided with the -z option), but the class still works as a description of the delimiter. It is also possible, and shorter, to use the GNU shorthand \s:
grep -c '^1[[:space:]]' file
grep -c '^1\s' file
But this option will fail if the delimiter is something like a colon (:) or any other punctuation character (or any letter).
Boundary
Alternatively, we can use the transition from a digit to a "not a digit" as a boundary; strictly speaking, the boundary is with "a character not in [_[:alnum:]] (_a-zA-Z0-9)":
grep -c '^1\b' file # portable but not POSIX.
grep -c '^1\>' file # portable but not POSIX.
grep -wc '^1' file # portable but not POSIX.
grep -c '^1\W' file # portable but not POSIX (does not match a line that is only a `1`) (underscore handling differs in BSD).
These patterns also accept lines that start with a 1 followed by a punctuation character.
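As a rough illustration of how the boundary variants behave, here is a sketch that assumes GNU grep (other implementations may differ, particularly for \W) and a made-up sample file with hypothetical variable names:

```shell
# Hypothetical sample: tab and colon delimiters, a bare "1",
# and three lines that should never match ("1_x", "1a", "10<TAB>baz").
sample=$(mktemp)
printf '1\tfoo\n1:bar\n1\n1_x\n1a\n10\tbaz\n' > "$sample"

# Word boundary after the 1: tab, colon and end of line all count.
b_count=$(grep -c '^1\b' "$sample")
gt_count=$(grep -c '^1\>' "$sample")
w_count=$(grep -wc '^1' "$sample")

# \W needs an actual non-word character after the 1,
# so the line holding only "1" is not counted.
W_count=$(grep -c '^1\W' "$sample")

echo "\\b: $b_count, \\>: $gt_count, -w: $w_count, \\W: $W_count"
rm -f "$sample"
```

With GNU grep the first three variants count three lines each (tab, colon and the bare 1), while \W counts two. The underscore and letter cases are rejected by all four because both are word characters.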
Sounds like you just want this:
$ grep '^1\b' a
1 TGCAG.....
1 TGCAG......
For the counting portion of this:
$ grep -c '^1\b' file
2
With awk:
awk '$1 == "1" { print; x++ } END { print x + 0, "total matches" }' inputfile
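The awk approach compares the whole first field rather than a prefix, so the result depends on the field separator. A minimal sketch, using a made-up sample file and hypothetical variable names, contrasts the default separator (runs of blanks) with a strict tab separator:

```shell
# Hypothetical sample: tab-delimited, space-delimited, colon-delimited,
# a line starting with 10, and a bare "1".
sample=$(mktemp)
printf '1\tTGCAG\n10\tAAA\n1 GGG\n1:CCC\n1\n' > "$sample"

# Default splitting on blanks: "1<TAB>TGCAG", "1 GGG" and "1" have $1 == "1".
default_count=$(awk '$1 == "1" { n++ } END { print n + 0 }' "$sample")

# Tab-only splitting: "1 GGG" is now a single field and no longer matches.
tab_count=$(awk -F'\t' '$1 == "1" { n++ } END { print n + 0 }' "$sample")

echo "default: $default_count, tab only: $tab_count"
rm -f "$sample"
```

Adding 0 to the counter makes awk print 0 instead of an empty string when nothing matches.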