How to delete input field in AWK?
Deleting fields in awk is notoriously difficult. It seems to be such a simple (and often required) operation but it's harder than it should be.
See "Is there a way to completely delete fields in awk, so that extra delimiters do not print?" on Stack Overflow for a good way to do this.
I've copied the rmcol() function from @ghoti's answer, so that we have a copy here on U&L:
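# i is a local variable (awk declares locals as extra parameters)
function rmcol(col,   i) {
    # shift every field after col one position to the left
    for (i=col; i<NF; i++) {
        $i=$(i+1)
    }
    # drop the now-duplicated last field
    NF--
}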
It deletes the specified column from the current input line by shifting every later field one position to the left, then decrements the field counter (NF) to match.
I have no idea what your transform() function does, so I won't even attempt to duplicate that, but here's an example of using rmcol() in an awk one-liner:
$ echo 'field1,field2,field3' | awk -F, -v OFS=, '
function rmcol(col,   i) {
    for (i=col; i<NF; i++) {
        $i=$(i+1)
    }
    NF--
}
{ rmcol(2); print; }
'
field1,field3
BTW, if you need to delete multiple fields from an input line, it is best/easiest to delete them in reverse order. That is, delete the highest-numbered fields first. Why? Because the higher-numbered fields will be renumbered every time you delete a lower-numbered field, making it very difficult to keep track of which field number belongs to which field.
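As a minimal sketch of the reverse-order advice, here is rmcol() removing fields 2 and 4 from a five-field line. The wrapper name drop_2_and_4 is made up for this example, and the sketch assumes an awk that rebuilds $0 when NF is decremented (gawk and mawk do; some older awks may not).

```shell
drop_2_and_4() {   # hypothetical helper name, not part of the original answer
    awk -F, -v OFS=, '
    function rmcol(col,   i) {
        for (i = col; i < NF; i++) $i = $(i + 1)
        NF--
    }
    # Delete the highest-numbered field first, so field 2 is still field 2
    # when its turn comes.
    { rmcol(4); rmcol(2); print }'
}

printf 'a,b,c,d,e\n' | drop_2_and_4
```

This should print a,c,e. With the calls in ascending order (rmcol(2) and then rmcol(4)), the first deletion would move d into position 3, so the second call would remove e instead and the result would be a,c,d.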
BTW, delete in awk is a statement for deleting elements of an array, not for deleting fields from an input line. You could split() each input line (on FS) into an array and delete the 2nd array element, but then you'd have to write a join() function to print the array with a comma (or OFS) separating each field.
Even doing that would be more complicated than one would expect because all arrays in awk are associative arrays (i.e. subscripts are just keys, not positions in a sequence), so delete array[2] won't automatically shift array elements 3+ into elements 2+. You'd have to write your own wrapper function around delete to do pretty much the same thing for arrays that rmcol() does for input fields.
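For comparison, a rough sketch of that array-based approach might look like the following. The names rm_field_via_array, adelete() and join() are all invented for this example (awk has no built-in join), and it assumes a single-character FS so that split() treats it literally.

```shell
rm_field_via_array() {   # hypothetical wrapper; $1 is the field number to drop
    awk -F, -v OFS=, -v col="$1" '
    # Remove element idx from an array indexed 1..n by shifting the tail
    # down one slot and deleting the last element. Returns the new count.
    function adelete(arr, n, idx,   i) {
        for (i = idx; i < n; i++) arr[i] = arr[i + 1]
        delete arr[n]
        return n - 1
    }
    # Concatenate arr[1..n], with sep between elements.
    function join(arr, n, sep,   i, out) {
        out = ""
        for (i = 1; i <= n; i++) out = out (i > 1 ? sep : "") arr[i]
        return out
    }
    {
        n = split($0, f, FS)
        if (col >= 1 && col <= n) n = adelete(f, n, col)
        print join(f, n, OFS)
    }'
}

printf 'field1,field2,field3\n' | rm_field_via_array 2
```

This should print field1,field3, the same as the rmcol() one-liner, but it needs two helper functions instead of one, which is why working on the fields directly is usually less effort.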
Some alternatives
1) Pre-process the input to remove the field first. This is easy to do with cut if the field separator is a single character:
$ s='field1,field2,field3'
$ # use 'cut -d, -f1,3-' if --complement option is not available
$ echo "$s" | cut -d, --complement -f2
field1,field3
$ echo "$s" | cut -d, --complement -f2 | awk 'BEGIN{FS=OFS=","} {$1="new"} 1'
new,field3
2) Use perl:
$ # indexing starts from 0, the array @F contains the input fields
$ # $#F will give index of last element in the array
$ echo "$s" | perl -F, -lane '$F[0]="new"; print join ",", @F[0,2..$#F]'
new,field3
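For alternative 1, on systems where cut has no --complement option (it is a GNU extension), the field list can be spelled out instead. A small sketch, with the helper name rm_field2_portable invented for the example:

```shell
rm_field2_portable() {   # hypothetical helper name
    # -f1,3- selects field 1 plus fields 3 through the last,
    # i.e. everything except field 2
    cut -d, -f1,3-
}

printf 'field1,field2,field3\n' | rm_field2_portable |
    awk 'BEGIN{FS=OFS=","} {$1="new"} 1'
```

This should print new,field3, matching the --complement version.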