Using awk to sum the values of a column, based on the values of another column
awk -F '|' '$1 ~ /smiths/ {sum += $3} END {print sum}' inputfilename
- The
-F
flag sets the field separator; I put it in single quotes because it is a special shell character. - Then
$1 ~ /smiths/
applies the following {code block} only to lines where the first field matches the regex/smiths/
. - The rest is the same as your code.
Note that since you're not really using a regex here, just a specific value, you could just as easily use:
awk -F '|' '$1 == "smiths" {sum += $3} END {print sum}' inputfilename
Which checks string equality. This is equivalent to using the regex /^smiths$/
, as mentioned in another answer, which includes the ^
anchor to only match the start of the string (the start of field 1) and the $
anchor to only match the end of the string. Not sure how familiar you are with regexes. They are very powerful, but for this case you could use a string equality check just as easily.
Another approach is to use awk associative arrays, more info here. This line produces the desired output:
awk -F '|' '{a[$1] += $3} END{print a["smiths"]}' filename.txt
As a side effect, the array stores all other values:
awk -F '|' '{a[$1] += $3} END{for (i in a) print i, a[i]}' filename.txt
Output:
smiths 212
denniss 100
olivert 10
Very good so far. All you need to do is add a selector before the block to add the sum. Here we check that the first argument contains only "smiths":
awk 'BEGIN {FS = "|"} ; $1 ~ /^smiths$/ {sum+=$3} END {print sum}'
You could shorten this a bit by specifying the field separator as an option. In awk
it's generally a good idea to initialize variables on the command line:
awk -F'|' '$1 ~ /^smiths$/ {sum+=$3} END {print sum}'