How do I trim leading and trailing whitespace from each line of some output?
awk '{$1=$1;print}'
or shorter:
awk '{$1=$1};1'
This trims leading and trailing space or tab characters [1] and also squeezes sequences of tabs and spaces into a single space.
That works because when you assign something to one of the fields, awk rebuilds the whole record (as printed by print) by joining all fields ($1, ..., $NF) with OFS (space by default).
[1] And possibly other blank characters, depending on the locale and the awk implementation.
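To see the record being rebuilt, here is a small sketch. The sample line and the variable names (line, trimmed, dashed) are made up for illustration; the second command only changes OFS to show that the separator used for re-joining really comes from there.

```shell
# Hypothetical sample line: leading/trailing blanks plus runs of spaces and a tab inside.
line=$(printf '  foo   bar\tbaz  ')

# Assigning $1 to itself makes awk rebuild $0 from the fields, joined by OFS.
trimmed=$(printf '%s\n' "$line" | awk '{$1=$1};1')
printf '[%s]\n' "$trimmed"

# With a different OFS, the rebuilt record uses that separator instead of a space.
dashed=$(printf '%s\n' "$line" | awk -v OFS='-' '{$1=$1};1')
printf '[%s]\n' "$dashed"
```

The first printf should show [foo bar baz] and the second [foo-bar-baz], which also makes it obvious that whitespace inside the line is not preserved.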
The command can be condensed like so if you're using GNU sed:
$ sed 's/^[ \t]*//;s/[ \t]*$//' < file
Example
Here's the above command in action.
$ echo -e " \t blahblah \t " | sed 's/^[ \t]*//;s/[ \t]*$//'
blahblah
You can use hexdump to confirm that the sed command is stripping the desired characters correctly.
$ echo -e " \t blahblah \t " | sed 's/^[ \t]*//;s/[ \t]*$//' | hexdump -C
00000000  62 6c 61 68 62 6c 61 68  0a                       |blahblah.|
00000009
Character classes
Instead of listing the set literally, as in [ \t], you can also use character class names:
$ sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' < file
Example
$ echo -e " \t blahblah \t " | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'
Most of the GNU tools that make use of regular expressions (regex) support these classes. They are listed here with their equivalents, which hold only in the typical C locale of an ASCII-based system.
[[:alnum:]] - [A-Za-z0-9] Alphanumeric characters
[[:alpha:]] - [A-Za-z] Alphabetic characters
[[:blank:]] - [ \t] Space or tab characters only
[[:cntrl:]] - [\x00-\x1F\x7F] Control characters
[[:digit:]] - [0-9] Numeric characters
[[:graph:]] - [!-~] Printable and visible characters
[[:lower:]] - [a-z] Lower-case alphabetic characters
[[:print:]] - [ -~] Printable (non-Control) characters
[[:punct:]] - [!-/:-@[-`{-~] Punctuation characters
[[:space:]] - [ \t\v\f\n\r] All whitespace chars
[[:upper:]] - [A-Z] Upper-case alphabetic characters
[[:xdigit:]] - [0-9a-fA-F] Hexadecimal digit characters
Using these instead of literal sets may seem like a waste of space, but if you're concerned about your code being portable, or you have to deal with alternative character sets (think international), then you'll likely want to use the class names instead.
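As a rough sketch of the portable form, the class-based expressions can be wrapped in a small function. The name trim_lines and the sample input are made up for illustration. It avoids \t inside the bracket expression, which is the part that depends on GNU sed.

```shell
# trim_lines is a hypothetical helper: it strips leading and trailing
# blanks (space/tab) from every line read on stdin.
trim_lines() {
    sed -e 's/^[[:blank:]]*//' -e 's/[[:blank:]]*$//'
}

# Sample input: two lines padded with spaces and tabs.
out=$(printf ' \t blah blah \t \n\tsecond line  \n' | trim_lines)
printf '%s\n' "$out"
```

Unlike the awk approach, this leaves whitespace inside each line untouched, so "blah blah" keeps its single inner space and only the ends are trimmed.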
References
- Section 3 of the sed FAQ
xargs without arguments does that.
Example:
trimmed_string=$(echo "no_trimmed_string" | xargs)
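One thing to keep in mind: with no command given, xargs runs echo on the words it reads, so it does more than trim the ends. The sketch below (sample strings and variable names are made up) shows that inner runs of blanks get squeezed and that separate lines get joined. xargs also interprets quotes and backslashes, so input containing an unmatched quote may make it fail.

```shell
# Inner runs of blanks are squeezed, not only the ends trimmed.
squeezed=$(printf '  a   b  \n' | xargs)
printf '[%s]\n' "$squeezed"

# Several input lines come out joined on a single line.
joined=$(printf ' one \n two \n' | xargs)
printf '[%s]\n' "$joined"
```

This should print [a b] and then [one two], so xargs is best suited to trimming a single short string rather than every line of some output.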