Manipulate some poorly delimited data into a useful CSV
One way to do it is to put everything into a hash:
# put values into a hash keyed on the id and type
awk 'NR>1{n[$2","$4]+=$1}
END{
    # merge the sums for the same id onto one line
    for(i in n){
        id=i;
        sub(/,.*/,"",id);
        a[id]=a[id]","n[i];
    }
    # print everything
    for(i in a){
        print i""a[i];
    }
}'
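A minimal sketch of a run, assuming the awk program (the part between the single quotes) is saved as sum.awk and the raw data is in data.txt, with the count in field 1, the id in field 2 and the type in field 4; the file names and column layout are assumptions here, not taken from the question:

# both file names are hypothetical
awk -f sum.awk data.txt > sums.csv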
Edit: my first answer didn't answer the question properly.
Perl to the rescue:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

<>;  # Skip the header.

my %sum;
my %types;
while (<>) {
    # Fields are count, id and type; the literal | and the blanks are thrown away.
    my ($count, $id, $type) = grep length, split '[\s|]+';
    $sum{$id}{$type} += $count;
    $types{$type} = 1;
}

say join ',', 'id', sort keys %types;
for my $id (sort { $a <=> $b } keys %sum) {
    # Undefined (id, type) combinations become empty fields.
    say join ',', $id, map $_ // q(), @{ $sum{$id} }{ sort keys %types };
}
It keeps two tables: %types records every type seen, and %sum stores, for each id, the sum per type.
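A quick sketch of a run, assuming the script is saved as sum_by_type.pl and each data line holds a count, an id, a literal | and a type; the file name and the sample numbers below are made up purely for illustration:

$ cat data.txt
count  id |  type
15      4 |  1
253     4 |  2
19871   4 |  3
17     18 |  1
$ perl sum_by_type.pl data.txt
id,1,2,3
4,15,253,19871
18,17,,

Combinations an id never had come out as empty fields; replace q() with 0 in the last say if you want zeros instead, like the datamash answer below produces.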
If GNU datamash is an option for you, then
awk 'NR>1 {print $1, $2, $4}' OFS=, file | datamash -t, -s --filler=0 crosstab 2,3 sum 1
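The awk step drops the header and the third whitespace-separated field (presumably the literal |), leaving comma-separated count,id,type; datamash then cross-tabulates column 2 (id) against column 3 (type) and sums column 1 (count), with -s sorting the keys and --filler=0 filling in the combinations that never occur. The result: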
,1,2,3
10,0,0,588
12,0,0,10
14,0,0,883
17,0,0,98
18,17,0,77598
2,0,0,17892
21,0,0,10000
23,0,0,20000
27,0,0,63
3,0,0,6
35,0,0,2446
4,15,253,19871
5,0,0,1000