Use backslash or single-quotes for field separation
That's all to do with your shell, not with awk.
In Bourne-like shells, \, '...' and "..." are all quoting operators.
Quoting removes the special meaning a character may have in the syntax of the shell. \ quotes a single character (except for newline, which it removes instead); '...' and "..." can quote more than one (with "..." not quoting every character).
; is a special character in the syntax of the shell. It's used to separate commands. You want to quote it if you want to pass it verbatim to a command. Either \; or ';' will do.
";" will also do as ; is not one of those characters that are still special within double quotes, but you'd need "\\" to pass one literal backslash to a command because \ is one of those characters that are still special within "..." (though it's then only special when followed by other special characters within "..." like that " itself).
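As a quick check of the quoting rules above, here is a minimal sketch for a Bourne-like shell. It uses printf with a %s format so that the argument is shown exactly as the shell passed it; the variable names are only for illustration.

```shell
# Each of these passes the single character ; to printf as its argument.
a=$(printf '%s' \;)
b=$(printf '%s' ';')
c=$(printf '%s' ";")
# Inside double quotes, a backslash must itself be escaped to get one literal backslash.
d=$(printf '%s' "\\")
# Prints <;> three times, then <\>
printf '<%s>\n' "$a" "$b" "$c" "$d"
```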
Again that very much depends on the shell. In the rc shell for instance, \ and " are not special, let alone quoting characters. -F\; wouldn't work there, as the command line would be parsed as two commands, awk -F\ and ..., separated by ;.
See How to use a special character as a normal one? for more details.
To complicate things further, note that the argument to -F itself also goes through one or two layers of backslash processing by awk.
awk first processes the argument it receives to expand ANSI C escape sequences in it. If you use awk -F '\t' or awk -F \\t or awk -F "\\t" or awk -F "\t", awk receives an argument that contains \t, which it expands to a TAB character. The FS awk variable will contain a TAB character, not \t.
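The four spellings above can be compared with a small sketch like this one, assuming a Bourne-like shell and a made-up TAB-separated record. In every case the shell hands awk the two characters \t, and awk turns them into one TAB.

```shell
# A hypothetical one-record input with TAB-separated fields.
input=$(printf 'one\ttwo\tthree')
# Four shell spellings, one and the same argument received by awk: \t
r1=$(printf '%s\n' "$input" | awk -F '\t' '{print $2}')
r2=$(printf '%s\n' "$input" | awk -F \\t '{print $2}')
r3=$(printf '%s\n' "$input" | awk -F "\\t" '{print $2}')
r4=$(printf '%s\n' "$input" | awk -F "\t" '{print $2}')
# FS holds a single character (the TAB), not the two characters \ and t.
fs_len=$(awk -F '\t' 'BEGIN{print length(FS)}')
printf '%s\n' "$r1" "$r2" "$r3" "$r4" "$fs_len"
```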
With awk -F '\\', awk receives a \\ argument and sets FS to the \ character. Strictly speaking, awk -F '\' is unspecified as that escape sequence is unfinished, but in practice, except for busybox awk, all awk implementations I know treat it the same as awk -F '\\'.
In awk, when FS contains a single character (other than a space, which is treated specially), that character is the field separator. awk -F . splits the records on dot characters.
However when FS contains more than one character, it is interpreted as a regular expression. awk -F .. doesn't split on sequences of two dots, but on sequences of any two characters as . is the regular expression operator that matches any single character. To split on two dots, you'd need awk -F '[.][.]' or awk -F '\\.\\.'.
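To see the difference between a one-character FS and a regular-expression FS, a sketch along these lines can help. The input strings are invented for the illustration.

```shell
# One-character FS: plain split on dots.
single=$(printf '%s\n' 'a.b.c' | awk -F . '{print $2}')
# FS is the regex .. (any two characters): "ab" and "cd" act as separators,
# so the first two fields are empty and the third one is "e".
any_two=$(printf '%s\n' 'abcde' | awk -F .. '{print $3}')
# Both spellings below split on two literal dots only.
bracket=$(printf '%s\n' 'a..b.c' | awk -F '[.][.]' '{print $2}')
escaped=$(printf '%s\n' 'a..b.c' | awk -F '\\.\\.' '{print $2}')
printf '%s\n' "$single" "$any_two" "$bracket" "$escaped"
```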
With awk -F '\\\\', a literal \\\\ is passed by the shell to awk. awk expands each of those two \\ to \, so FS becomes \\, which is treated as a regular expression. \ is also special in the regular expression syntax and is used to remove the special meaning of a character, as a regex operator this time. So again, that is splitting on backslash characters, though this time as a regular expression.
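One way to check what actually ends up in FS is to print it from a BEGIN block. This is only a sketch for a Bourne-like shell; the variable names and the sample record are illustrative.

```shell
# awk receives \\ and stores a single backslash in FS.
fs_two=$(awk -F '\\' 'BEGIN{print FS}')
# awk receives \\\\ and stores two backslashes in FS (a regex matching one backslash).
fs_four=$(awk -F '\\\\' 'BEGIN{print FS}')
# With the two-backslash regex, records are still split on single backslashes.
field=$(printf '%s\n' 'a\b\c' | awk -F '\\\\' '{print $2}')
printf '%s\n' "$fs_two" "$fs_four" "$field"
```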
So, in practice, to split on \, all of these (in Bourne-like shells) will work:
awk -F '\' # FS becomes a single \ except in busybox where it's empty
awk -F "\\" # instead so it's a one-character split on backslash
awk -F \\ # and a one-field-by-character split in busybox
awk -F '\\' # FS becomes a single \ in every awk implementation
awk -F \\\\ # so one-character split on backslash
awk -F "\\\\"
awk -F '\\\' # FS is \ on busybox and \\ in other implementations
awk -F \\\\\\ # so one-character split on backslash in busybox and
awk -F "\\\\\\" # \\ regex split in other implementations, to the same effect
awk -F '\\\\' # FS is \\ in all implementations so
awk -F \\\\\\\\ # \\ regex split
awk -F "\\\\\\\\"
I would advise using single quotes as they are the most straightforward and least surprising kind of quotes. So here, to split on backslash portably: awk -F '\\'.
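As a usage sketch of the portable form, here it is applied to a made-up Windows-style path (the path is purely hypothetical):

```shell
# Hypothetical input: a path using backslash as its separator.
path='C:\Users\me\file.txt'
# Single-quoted '\\' reaches awk as two backslashes, which awk reduces to one.
nf=$(printf '%s\n' "$path" | awk -F '\\' '{print NF}')
last=$(printf '%s\n' "$path" | awk -F '\\' '{print $NF}')
# Prints: 4 fields, last is file.txt
printf '%s fields, last is %s\n' "$nf" "$last"
```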
You can also do things like:
awk -v FS='\\' ...
Or
awk 'BEGIN{FS="\\"} ...'
or
awk ... 'FS=\\'
or:
FS='\' awk 'BEGIN{FS = ENVIRON["FS"]} ...'
(that one avoids the extra backslash expansion performed by awk, so it needs only one backslash).
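The four alternatives can be tried side by side with something like the following sketch, assuming a Bourne-like shell and an invented sample line. Each command should pick out the same second field.

```shell
line='a\b\c'
# -v assignment: awk expands \\ to a single backslash.
v=$(printf '%s\n' "$line" | awk -v FS='\\' '{print $2}')
# Assignment inside the program: the awk string "\\" is one backslash.
b=$(printf '%s\n' "$line" | awk 'BEGIN{FS="\\"} {print $2}')
# Assignment operand after the program: same escape processing as -v.
o=$(printf '%s\n' "$line" | awk '{print $2}' 'FS=\\')
# Environment variable: no escape processing, so one backslash is enough.
e=$(printf '%s\n' "$line" | FS='\' awk 'BEGIN{FS = ENVIRON["FS"]} {print $2}')
printf '%s\n' "$v" "$b" "$o" "$e"
```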
All characters within single quotes are treated literally (i.e. no character is special between a pair of single quotes). Without single quotes, you need to backslash-escape a character with a special meaning if you want to use the literal character.
These are the quoting rules of the shell and are unrelated to awk.