Trim leading and trailing spaces from a string in awk

I just came across this. The correct answer is:

awk 'BEGIN{FS=OFS=","} {gsub(/^[[:space:]]+|[[:space:]]+$/,"",$2)} 1'

Warning by @Geoff: see my note below, only one of the suggestions in this answer works (though on both columns).

I would use sed:

sed 's/, /,/' input.txt

This will remove on leading space after the , . Output:

Name,Order
Trim,working
cat,cat1

More general might be the following, it will remove possibly multiple spaces and/or tabs after the ,:

sed 's/,[ \t]\?/,/g' input.txt

It will also work with more than two columns because of the global modifier /g


@Floris asked in discussion for a solution that removes trailing and and ending whitespaces in each colum (even the first and last) while not removing white spaces in the middle of a column:

sed 's/[ \t]\?,[ \t]\?/,/g; s/^[ \t]\+//g; s/[ \t]\+$//g' input.txt

*EDIT by @Geoff, I've appended the input file name to this one, and now it only removes all leading & trailing spaces (though from both columns). The other suggestions within this answer don't work. But try: " Multiple spaces , and 2 spaces before here " *


IMO sed is the optimal tool for this job. However, here comes a solution with awk because you've asked for that:

awk -F', ' '{printf "%s,%s\n", $1, $2}' input.txt

Another simple solution that comes in mind to remove all whitespaces is tr -d:

cat input.txt | tr -d ' '

remove leading and trailing white space in 2nd column

awk 'BEGIN{FS=OFS=","}{gsub(/^[ \t]+/,"",$2);gsub(/[ \t]+$/,"",$2)}1' input.txt

another way by one gsub:

awk 'BEGIN{FS=OFS=","} {gsub(/^[ \t]+|[ \t]+$/, "", $2)}1' infile

If you want to trim all spaces, only in lines that have a comma, and use awk, then the following will work for you:

awk -F, '/,/{gsub(/ /, "", $0); print} ' input.txt

If you only want to remove spaces in the second column, change the expression to

awk -F, '/,/{gsub(/ /, "", $2); print$1","$2} ' input.txt

Note that gsub substitutes the character in // with the second expression, in the variable that is the third parameter - and does so in-place - in other words, when it's done, the $0 (or $2) has been modified.

Full explanation:

-F,            use comma as field separator 
               (so the thing before the first comma is $1, etc)
/,/            operate only on lines with a comma 
               (this means empty lines are skipped)
gsub(a,b,c)    match the regular expression a, replace it with b, 
               and do all this with the contents of c
print$1","$2   print the contents of field 1, a comma, then field 2
input.txt      use input.txt as the source of lines to process

EDIT I want to point out that @BMW's solution is better, as it actually trims only leading and trailing spaces with two successive gsub commands. Whilst giving credit I will give an explanation of how it works.

gsub(/^[ \t]+/,"",$2);    - starting at the beginning (^) replace all (+ = zero or more, greedy)
                             consecutive tabs and spaces with an empty string
gsub(/[ \t]+$/,"",$2)}    - do the same, but now for all space up to the end of string ($)
1                         - ="true". Shorthand for "use default action", which is print $0
                          - that is, print the entire (modified) line

Tags:

Unix

Shell

Awk