What characters are required to be escaped in command line arguments?
The following characters have special meaning to the shell itself in some contexts and may need to be escaped in arguments:
| Character | Unicode | Name | Usage |
|---|---|---|---|
| `` ` `` | U+0060 Grave Accent | Backtick | Command substitution |
| `~` | U+007E | Tilde | Tilde expansion |
| `!` | U+0021 | Exclamation mark | History expansion |
| `#` | U+0023 Number sign | Hash | Comments |
| `$` | U+0024 | Dollar sign | Parameter expansion |
| `&` | U+0026 | Ampersand | Background commands |
| `*` | U+002A | Asterisk | Filename expansion and globbing |
| `(` | U+0028 | Left Parenthesis | Subshells |
| `)` | U+0029 | Right Parenthesis | Subshells |
|  | U+0009 | Tab (⇥) | Word splitting (whitespace) |
| `{` | U+007B Left Curly Bracket | Left brace | Brace expansion |
| `[` | U+005B | Left Square Bracket | Filename expansion and globbing |
| `\|` | U+007C Vertical Line | Vertical bar | Pipelines |
| `\` | U+005C Reverse Solidus | Backslash | Escape character |
| `;` | U+003B | Semicolon | Separating commands |
| `'` | U+0027 Apostrophe | Single quote | String quoting |
| `"` | U+0022 Quotation Mark | Double quote | String quoting with interpolation |
| ↩ | U+000A Line Feed | Newline | Line break |
| `<` | U+003C | Less than | Input redirection |
| `>` | U+003E | Greater than | Output redirection |
| `?` | U+003F | Question mark | Filename expansion and globbing |
|  | U+0020 | Space | Word splitting<sup>1</sup> (whitespace) |
Some of those characters are used for more things and in more places than the one I linked.
There are a few corner cases that are explicitly optional:
- `!` can be disabled with `set +H`, which is the default in non-interactive shells.
- `{` can be disabled with `set +B`.
- `*` and `?` can be disabled with `set -f` or `set -o noglob`.
- `=` Equals sign (U+003D) also needs to be escaped if `set -k` or `set -o keyword` is enabled.
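As a small illustration of the `*` and `?` case, the sketch below turns globbing off inside a subshell, so an unquoted `*` is passed through literally. The function name `noglob_demo` is made up for this example.

```shell
# Hypothetical helper: print an unquoted * with filename expansion disabled.
noglob_demo() {
    (
        set -f              # equivalent to: set -o noglob
        printf '%s\n' *     # the * is not expanded while -f is in effect
    )
}

noglob_demo
```

The subshell keeps the option change from leaking into the rest of the script; `set +f` would turn globbing back on.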
Escaping a newline requires quoting; a backslash won't do the job, because a backslash followed by a newline is treated as a line continuation and both are removed. Any other characters listed in `IFS` will need similar handling. You don't need to escape `]` or `}`, but you do need to escape `)` because it's an operator.
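The newline case can be checked with a short sketch. The variable names `joined` and `kept` are only illustrative.

```shell
# Backslash-newline is a line continuation: both characters vanish.
joined=$(printf '%s' foo\
bar)

# A newline inside quotes stays part of the argument.
kept=$(printf '%s' 'foo
bar')

printf '[%s]\n' "$joined" "$kept"
```

Here `joined` ends up as a single word with no newline in it, while `kept` still contains the line break.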
Some of these characters have tighter limits on when they truly need escaping than others. For example, `a#b` is OK, but `a #b` is a comment, while `>` would need escaping in both contexts. It doesn't hurt to escape them all conservatively anyway, and it's easier than remembering the fine distinctions.
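A minimal sketch of the `#` distinction, using `set --` to capture what the shell actually passes as arguments. The variable names are made up for the example.

```shell
set -- a#b          # '#' in the middle of a word is literal
inside="$*"

set -- a #b         # '#' at the start of a word begins a comment
commented="$*"

set -- a \#b        # escaped, so '#b' becomes a second argument
escaped="$*"

printf '%s\n' "$inside" "$commented" "$escaped"
```

Only the second form loses text to a comment; escaping the `#` in the first form would be harmless but unnecessary.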
If your command name itself is a shell keyword (`if`, `for`, `do`) then you'll need to escape or quote it too. The only interesting one of those is `in`, because it's not obvious that it's always a keyword. You don't need to do that for keywords used in arguments, only when you've (foolishly!) named a command after one of them. Shell operators (`(`, `&`, etc.) always need quoting wherever they are.
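To see that quoting or escaping stops a keyword from being recognised as one, the sketch below tries to run `if` as an ordinary command. It assumes no program named `if` exists on `PATH`, so the lookup is expected to fail with the shell's "command not found" status.

```shell
# Assumption: there is no executable called "if" on PATH.
quoted_status=0
'if' 2>/dev/null || quoted_status=$?      # quoted: looked up as a command name

escaped_status=0
\if 2>/dev/null || escaped_status=$?      # escaped: same effect

printf '%s %s\n' "$quoted_status" "$escaped_status"
```

An unquoted `if` in the same position would instead start a conditional and leave the shell waiting for `then`.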
<sup>1</sup> Stéphane has noted that any other single-byte blank character from your locale also needs escaping. In most common, sensible locales, at least those based on C or UTF-8, it's only the whitespace characters above. On some systems, including Solaris, the BSDs, and OS X, ISO-8859-1 locales also treat U+00A0 no-break space as blank (I think incorrectly). If you're dealing with an arbitrary unknown locale, it could include just about anything, including letters, so good luck.
Conceivably, a single byte considered blank could appear within a multi-byte character that wasn't blank, and you'd have no way to escape that other than putting the whole thing in quotes. This isn't a theoretical concern: in an ISO-8859-1 locale from above, that `A0` byte, which is considered a blank, can appear within multibyte characters like UTF-8 encoded "à" (`C3 A0`). To handle those characters safely you would need to quote them: `"à"`. This behaviour depends on the locale configuration in the environment running the script, not the one where you wrote it.
I think this behaviour is broken multiple ways, but we have to play the hand we're dealt. If you're working with any non-self-synchronising multibyte character set, the safest thing would be to quote everything. If you're in UTF-8 or C, you're safe (for the moment).
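One way to "quote everything" is to wrap the whole string in single quotes, since the only character that needs special treatment inside them is the single quote itself. The following is a minimal sketch of that idea, not a vetted implementation; `shell_quote` is a hypothetical helper name, and it relies on `sed` being available.

```shell
# Hypothetical helper: emit $1 wrapped in single quotes, with each
# embedded single quote rewritten as '\'' (close, escaped quote, reopen).
# Caveat: command substitution drops trailing newlines from the input.
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

quoted=$(shell_quote "it's a * test; \$HOME")
printf '%s\n' "$quoted"

# Round trip: the quoted form evaluates back to the original single argument.
eval "set -- $quoted"
printf '%s\n' "$1"
```

The round trip through `eval` is only there to show that the quoted form parses back to one argument with the original contents.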
In GNU Parallel this is tested and used extensively:
    $a =~ s/[\002-\011\013-\032\\\#\?\`\(\)\{\}\[\]\^\*\<\=\>\~\|\; \"\!\$\&\'\202-\377]/\\$&/go;
    # quote newline as '\n'
    $a =~ s/[\n]/'\n'/go;
It is tested in `bash`, `dash`, `ash`, `ksh`, `zsh`, and `fish`. Some of the characters do not need quoting in some shells (or some versions of them), but the above works in all tested shells.
If you simply want a string quoted, you can pipe it into `parallel --shellquote`:

    printf "&*\t*!" | parallel --shellquote