Shell valid function name characters
Since the POSIX documentation allows it as an extension, nothing prevents an implementation from behaving that way.
A simple check (run in zsh):
$ for shell in /bin/*sh 'busybox sh'; do
printf '[%s]\n' $shell
$=shell -c 'á() { :; }'
done
[/bin/ash]
/bin/ash: 1: Syntax error: Bad function name
[/bin/bash]
[/bin/dash]
/bin/dash: 1: Syntax error: Bad function name
[/bin/ksh]
[/bin/lksh]
[/bin/mksh]
[/bin/pdksh]
[/bin/posh]
/bin/posh: á: invalid function name
[/bin/yash]
[/bin/zsh]
[busybox sh]
sh: syntax error: bad function name
shows that bash, zsh, yash, ksh93 (which ksh is linked to on my system), pdksh and its derivatives allow multibyte characters in function names.
yash was designed to support multibyte characters from the beginning, so it's no surprise that it works.
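To probe a single shell from within a script rather than from an interactive loop, the same idea can be sketched as a small helper. This is only a sketch under stated assumptions: accepts_function_name is a hypothetical name, the candidate is assumed to be trusted input (it is passed to eval), and the definition is attempted in a subshell so that nothing leaks into the calling shell.

```shell
# Hypothetical helper: succeed if the running shell accepts "$1" as a
# function name. The definition happens in a subshell, so the function
# never exists in the caller. Do not pass untrusted input: it goes to eval.
accepts_function_name() {
    ( eval "$1() { :; }" ) 2>/dev/null
}

# Usage: the result for a non-ASCII name depends on the shell and locale.
for name in my_func 'á'; do
    if accepts_function_name "$name"; then
        printf 'accepted: %s\n' "$name"
    else
        printf 'rejected: %s\n' "$name"
    fi
done
```

Which names are reported as accepted depends on the shell running the snippet, as the survey above suggests.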
The other documentation you can refer to is that of ksh93:
A blank is a tab or a space. An identifier is a sequence of letters, digits, or underscores starting with a letter or underscore. Identifiers are used as components of variable names. A vname is a sequence of one or more identifiers separated by a . and optionally preceded by a .. Vnames are used as function and variable names. A word is a sequence of characters from the character set defined by the current locale, excluding non-quoted metacharacters.
So setting the locale to C:
$ export LC_ALL=C
$ á() { echo 1; }
ksh: á: invalid function name
makes it fail.
Note that functions share the same namespace as other commands including commands in the file system, which on most systems have no limitation on the characters or even bytes they may contain in their path.
So while most shells restrict the characters of their functions, there's no real good reason why they would do that. That means in those shells, there are commands you can't replace with a function.
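The shared namespace can be illustrated with a short sketch, assuming a POSIX-style shell and an external uname utility on the PATH. The variable names are illustrative only.

```shell
# A function named after an external command takes precedence over it.
uname() {
    echo "shadowed"
}

via_function=$(uname)          # runs the function defined above
via_path=$(command uname)      # "command" skips function lookup

unset -f uname                 # restore normal lookup

printf 'function: %s\n' "$via_function"
printf 'external: %s\n' "$via_path"
```

In a shell that rejects a given name as a function name, this kind of replacement is simply not available for a command with that name.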
zsh and rc allow anything for their function names including some with / and the empty string. zsh even allows NUL bytes.
$ zsh
$ $'\0'() echo nul
$ ^@
nul
$ ""() uname
$ ''
Linux
$ /bin/ls() echo test
$ /bin/ls
test
A simple command in a shell is a list of arguments, and the first argument is used to derive the command to execute. So it's only logical that those arguments and function names share the same possible values, and in zsh, arguments to builtins and functions can be any byte sequence.
There's no security issue here, as the functions you (the script author) define are the ones you invoke.
Where there may be security issues is when the parsing is affected by the environment, for instance with shells where the set of valid function names is affected by the locale.
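One conservative way to stay clear of locale-dependent parsing is to restrict function names to what POSIX calls a name: letters, digits and underscores from the portable character set, not starting with a digit. The following is a minimal sketch of such a check. is_portable_name is a hypothetical helper, and the characters are listed explicitly rather than as ranges such as a-z, since the meaning of ranges can vary with the locale.

```shell
# Hypothetical helper: succeed if "$1" is a portable POSIX name.
# Explicit character lists are used instead of ranges on purpose.
is_portable_name() {
    case $1 in
        '') return 1 ;;                 # empty string
        [0123456789]*) return 1 ;;      # must not start with a digit
        *[!abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_]*)
            return 1 ;;                 # contains some other character
    esac
    return 0
}

# Usage
for name in my_func2 1abc foo-bar; do
    if is_portable_name "$name"; then
        printf '%s: portable\n' "$name"
    else
        printf '%s: not portable\n' "$name"
    fi
done
```

Names that pass this check should parse the same way regardless of the locale the script is run in.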