Why does dash expand \\\\ differently to bash?
In
echo "A bug\\'s life"
because those are double quotes, and \ is special inside double quotes, the first \ is understood by the shell as escaping/quoting the second \. So an A bug\'s life argument is passed to echo.
echo "A bug\'s life"
would have achieved exactly the same: ' is not special inside double quotes, so the \ is not removed and the exact same argument is passed to echo.
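One way to check this for yourself is to print the argument exactly as the command receives it. This is only a minimal sketch; show_args is a made-up helper name, not something from the question, and it relies on printf '%s' so that nothing but the shell's own quote processing is visible.

```shell
# show_args is a hypothetical helper: print every argument exactly as
# the command received it, one per line, wrapped in <>.
show_args() {
    printf '<%s>\n' "$@"
}

show_args "A bug\\'s life"   # inside "...", \\ collapses to one \  : <A bug\'s life>
show_args "A bug\'s life"    # inside "...", \' is left untouched   : <A bug\'s life>
show_args 'A bug\\s life'    # inside '...', nothing is processed   : <A bug\\s life>
```

The first two calls print the same line, which is the point made above: the shell has already reduced \\ to \ before echo (or any other command) sees the argument.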
As explained at "Why is printf better than echo?", there's a lot of variation between echo implementations.
In Unix-conformant implementations like dash's, \ is used to introduce escape sequences: \n for newline, \b for backspace, \0123 for octal sequences... and \\ for backslash itself.
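If you want that Unix-style expansion regardless of which echo you have, the %b format of printf expands the same escape sequences in its argument. A rough sketch, where xsi_echo is a name I made up for illustration:

```shell
# xsi_echo is a hypothetical name: emulate a Unix-conformant echo with printf.
# %b expands \n, \b, \0ddd, \\ and so on in its argument.
xsi_echo() {
    printf '%b\n' "$*"
}

xsi_echo 'one\\two'    # \\ becomes a single backslash: one\two
xsi_echo 'A\0102C'     # \0102 is octal for B: ABC
```

Single quotes are used so that the shell passes the backslashes through untouched and only %b interprets them.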
Some (non-POSIX) ones require a -e option for that, or do it only when in conformance mode (like bash's when built with the right options, as for the sh of OS/X, or when called with BASHOPTS=xpg_echo in the environment; xpg_echo is a shopt option, so it is listed in BASHOPTS rather than SHELLOPTS).
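In bash the same switch can be flipped at run time with shopt. The sketch below only does something when run by bash (it checks BASH_VERSION first); the variable names default and xpg are mine, and it leaves xpg_echo turned off at the end, which may not be what you had before.

```shell
# Only bash has the xpg_echo option; other shells skip this block.
if [ -n "${BASH_VERSION-}" ]; then
    shopt -u xpg_echo            # bash's usual default: no escape expansion
    default=$(echo '\\')         # both backslashes are printed
    shopt -s xpg_echo            # Unix-conformant echo behaviour
    xpg=$(echo '\\')             # \\ is reduced to one backslash
    shopt -u xpg_echo            # back to the usual default
    printf 'default:  %s\nxpg_echo: %s\n' "$default" "$xpg"
fi
```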
So in standard echo implementations (Unix standard only; POSIX leaves the behaviour unspecified),
echo '\\'
which is the same as:
echo "\\\\"
outputs one backslash, while in bash when not in conformance mode:
echo '\\'
will output two backslashes.
It is best to avoid echo and use printf instead:
$ printf '%s\n' "A bug\'s life"
A bug\'s life
This works the same in this instance in all printf implementations.
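If you have many echo calls to replace, a small wrapper can help. This is just a sketch, and println is a name I picked for illustration: it joins its arguments with spaces like echo does, but never interprets backslashes or options.

```shell
# println: hypothetical drop-in for echo that prints its arguments
# verbatim, joined by spaces (with the default IFS), plus a newline.
println() {
    printf '%s\n' "$*"
}

println -e 'a\nb' '\\'    # prints: -e a\nb \\
```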
The issue for echo and printf comes down to understanding when a backslash-quoted character is a "special character".
The simplest case is a string passed as an argument to a %s format, as in printf '%s' "$string".
In this case there are no special characters to process, and everything that printf receives in the second argument is printed as-is.
Note that only single quotes are used:
$ printf '%s\n' '\\\\\\\\\T ' # nine \
\\\\\\\\\T # nine \
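When the string is used as the first argument (the format), some characters are special.
A \\ pair represents a single \ and, here, \T a single T (\T is not a standard escape sequence, so its treatment can vary between printf implementations):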
$ printf '\\\\\\\\\T ' # nine \
\\\\T # four \
Each of the four \\ pairs is transformed into a single \ and the final \T into a T.
$ printf '\\\\\\\\\a ' # nine \
\\\\ # four \
Each of the four \\ pairs is transformed into a single \ and the final \a into a bell (BEL) character (not printable).
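The difference between the two positions can be measured rather than eyeballed. A minimal sketch (the variable names are mine) that passes the same eight backslashes once as data and once as the format:

```shell
s='\\\\\\\\'                  # eight backslashes, protected by single quotes
as_data=$(printf '%s' "$s")   # passed as data: printed as-is
as_format=$(printf "$s")      # used as the format: each \\ pair becomes one \
printf '%s: %d backslashes\n' data "${#as_data}" format "${#as_format}"
# prints:
# data: 8 backslashes
# format: 4 backslashes
```

Only complete \\ pairs are used here, so the result does not depend on how a given printf treats other escape sequences.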
The same happens with some implementations of echo.
The dash implementation always transforms special backslash sequences.
If we place this code in a script:
set -- '\g ' '\\g ' '\\\g ' '\\\\g ' '\\\\\g ' '\\\\\\g ' '\\\\\\\g ' '\\\\\\\\g ' '\\\\\\\\\g '
# columns: printf '%s' (data), printf (format), echo, echo -e
for i; do
    printf '<%-14s> \t<%-9s> \t<%-14s> \t<%-12s>\n' \
        "$(printf '%s ' "|$i|")" \
        "$(printf "|$i|")" \
        "$(echo "|$i|")" \
        "$(echo -e "|$i|")"
done
Then dash will print (dash ./script):
<|\g | > <|\g | > <|\g | > <-e |\g | >
<|\\g | > <|\g | > <|\g | > <-e |\g | >
<|\\\g | > <|\\g | > <|\\g | > <-e |\\g | >
<|\\\\g | > <|\\g | > <|\\g | > <-e |\\g | >
<|\\\\\g | > <|\\\g | > <|\\\g | > <-e |\\\g | >
<|\\\\\\g | > <|\\\g | > <|\\\g | > <-e |\\\g | >
<|\\\\\\\g | > <|\\\\g | > <|\\\\g | > <-e |\\\\g | >
<|\\\\\\\\g | > <|\\\\g | > <|\\\\g | > <-e |\\\\g | >
<|\\\\\\\\\g | > <|\\\\\g |> <|\\\\\g | > <-e |\\\\\g |>
The first two columns will be the same (printf) for all shells.
The other two will change with the specific implementation of echo used.
For example, ash ./script (busybox ash):
<|\g | > <|\g | > <|\g | > <|\g | >
<|\\g | > <|\g | > <|\\g | > <|\g | >
<|\\\g | > <|\\g | > <|\\\g | > <|\\g | >
<|\\\\g | > <|\\g | > <|\\\\g | > <|\\g | >
<|\\\\\g | > <|\\\g | > <|\\\\\g | > <|\\\g | >
<|\\\\\\g | > <|\\\g | > <|\\\\\\g | > <|\\\g | >
<|\\\\\\\g | > <|\\\\g | > <|\\\\\\\g | > <|\\\\g | >
<|\\\\\\\\g | > <|\\\\g | > <|\\\\\\\\g | > <|\\\\g | >
<|\\\\\\\\\g | > <|\\\\\g |> <|\\\\\\\\\g | > <|\\\\\g | >
If the character used is an a, for dash:
<|\a | > <| | > <| | > <-e | | >
<|\\a | > <|\a | > <|\a | > <-e |\a | >
<|\\\a | > <|\ | > <|\ | > <-e |\ | >
<|\\\\a | > <|\\a | > <|\\a | > <-e |\\a | >
<|\\\\\a | > <|\\ | > <|\\ | > <-e |\\ | >
<|\\\\\\a | > <|\\\a | > <|\\\a | > <-e |\\\a | >
<|\\\\\\\a | > <|\\\ | > <|\\\ | > <-e |\\\ | >
<|\\\\\\\\a | > <|\\\\a | > <|\\\\a | > <-e |\\\\a | >
<|\\\\\\\\\a | > <|\\\\ | > <|\\\\ | > <-e |\\\\ | >
And for bash:
<|\a | > <| | > <|\a | > <| | >
<|\\a | > <|\a | > <|\\a | > <|\a | >
<|\\\a | > <|\ | > <|\\\a | > <|\ | >
<|\\\\a | > <|\\a | > <|\\\\a | > <|\\a | >
<|\\\\\a | > <|\\ | > <|\\\\\a | > <|\\ | >
<|\\\\\\a | > <|\\\a | > <|\\\\\\a | > <|\\\a | >
<|\\\\\\\a | > <|\\\ | > <|\\\\\\\a | > <|\\\ | >
<|\\\\\\\\a | > <|\\\\a | > <|\\\\\\\\a | > <|\\\\a | >
<|\\\\\\\\\a | > <|\\\\ | > <|\\\\\\\\\a | > <|\\\\ | >
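Since the echo columns differ from shell to shell, a script can probe which behaviour it is running under. This is only a sketch, and echo_mode is a variable name I made up:

```shell
# Probe the current shell's echo at run time: an echo that expands
# escapes turns \\ into \, while bash's default echo prints both.
if [ "$(echo '\\')" = '\' ]; then
    echo_mode=expands
else
    echo_mode=literal
fi
printf 'echo %s backslash sequences in this shell\n' "$echo_mode"
```

In practice it is usually simpler to avoid the question entirely and use printf, but the probe makes the difference between the tables above easy to reproduce.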
On top of that, we have to add the interpretation that the shell where the commands are executed may also apply to the string of characters.
$ printf '%s\n' '\\\\T '
\\\\T
$ printf '%s\n' "\\\\T "
\\T
Note that the shell takes some action on the backslashes inside the double quotes.
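The effect of the quoting alone can be seen by counting characters before any command gets involved. A small sketch, with variable names of my own choosing:

```shell
single='\\\\T'    # single quotes: all four backslashes are kept
double="\\\\T"    # double quotes: each \\ pair is reduced to one \
printf '%s -> %d chars\n' "$single" "${#single}" "$double" "${#double}"
# prints:
# \\\\T -> 5 chars
# \\T -> 3 chars
```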
With this code:
tab=' '
say(){ echo "$(printf '%s' "$a") $tab $(echo "$a") $tab $(echo -e "$a")"; }
a="one \a " ; say
a="two \\a " ; say
a="t33 \\\a " ; say
a="f44 \\\\a " ; say
a="f55 \\\\\a " ; say
a="s66 \\\\\\a " ; say
a="s77 \\\\\\\a " ; say
a="e88 \\\\\\\\a " ; say
a="n99 \\\\\\\\\a " ; say
Both effects get added, and we get this:
$ bash ./script
one \a one \a one
two \a two \a two
t33 \\a t33 \\a t33 \a
f44 \\a f44 \\a f44 \a
f55 \\\a f55 \\\a f55 \
s66 \\\a s66 \\\a s66 \
s77 \\\\a s77 \\\\a s77 \\a
e88 \\\\a e88 \\\\a e88 \\a
n99 \\\\\a n99 \\\\\a n99 \\
For dash it is even more severe:
$ dash ./script
one one -e one
two two -e two
t33 \a t33 -e t33
f44 \a f44 -e f44
f55 \ f55 \ -e f55 \
s66 \ s66 \ -e s66 \
s77 \\a s77 \a -e s77 \a
e88 \\a e88 \a -e e88 \a
n99 \\ n99 \ -e n99 \