How do ssh remote command line arguments get parsed
There is always a remote shell. In the SSH protocol, the client sends the server a string to execute. The SSH command line client takes its command line arguments and concatenates them with a space between the arguments. The server takes that string, runs the user's login shell and passes it that string. (More precisely: the server runs the program that is registered as the user's shell in the user database, passing it two command line arguments: -c and the string sent by the client. The shell is not invoked as a login shell: the server does not set the zeroth argument to a string beginning with -.)
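To make the two steps concrete, here is a minimal local sketch. It involves no network and no sshd: simulate_ssh is a hypothetical helper name, and /bin/sh stands in for whatever shell is registered in the user database. It only mimics the "join with spaces, then run shell -c string" behaviour described above.

```shell
# Hypothetical helper that mimics the client and server steps locally.
simulate_ssh() {
    # Client side: join the arguments with single spaces ("$*" joins on the
    # first character of IFS, which is a space by default).
    command_string="$*"
    # Server side: run the shell with two arguments, -c and the string.
    # /bin/sh is an assumption standing in for the user's registered shell.
    /bin/sh -c "$command_string"
}

# Argument boundaries do not survive: the inner shell re-splits the string,
# so the run of spaces inside the quoted argument collapses.
simulate_ssh echo 'a   b'            # prints: a b

# These two calls hand the inner shell exactly the same string.
simulate_ssh 'echo one; echo two'
simulate_ssh echo one \; echo two
```

The point of the sketch is that the inner shell only ever sees one string, so any quoting meant for the remote side has to be part of that string.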
It is impossible to bypass the remote shell. The protocol doesn't have anything like sending an array of strings that could be used as an argv array on the server. And the SSH server will not bypass the remote shell, because the shell itself may be a security restriction: using a restricted program as the user's shell is a way to provide a restricted account that is only allowed to run certain commands (e.g. an rsync-only account or a git-only account).
You may not see the shell in pstree because it may be already gone. Many shells have an optimization where if they detect that they are about to do “run this external command, wait for it to complete, and exit with the command's status”, then the shell runs “execve of this external command” instead. This is what's happening in your first example. Contrast the following three commands:
ssh otherhost pstree -a -p
ssh otherhost 'pstree -a -p'
ssh otherhost 'pstree -a -p; true'
The first two are identical: the client sends exactly the same data to the server. The third one sends a shell command which defeats the shell's exec optimization.
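A quick way to convince yourself of this without a server is to print the string the client would build. This is only a sketch: remote_string is a hypothetical helper that joins its arguments with spaces, the way the client is described as doing above.

```shell
# Hypothetical helper: print the single string that would be sent.
remote_string() {
    printf '%s\n' "$*"
}

remote_string pstree -a -p            # prints: pstree -a -p
remote_string 'pstree -a -p'          # prints: pstree -a -p
remote_string 'pstree -a -p; true'    # prints: pstree -a -p; true
```

Only the third string contains a command list, which is what keeps the remote shell alive while pstree runs.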
I think I figured it out:
$ ssh otherhost pstree -a -p -s '$$'
init,1
  `-sshd,3736
      `-sshd,11998
          `-sshd,12000
              `-pstree,12001 -a -p -s 12001
The arguments to pstree are to: show command line arguments, show pids, and show just the parent processes of the given pid. The '$$' is a special shell parameter that bash replaces with its own pid when it evaluates the command line. It's quoted once to stop it from being interpreted by my local shell. It is not doubly quoted or escaped, so the remote shell still interprets it.
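The same quoting effect can be seen locally, using sh -c as a stand-in for the remote shell (an analogy only, no ssh involved): the outer shell expands $$ inside double quotes, while single quotes pass it through for the inner shell to expand.

```shell
# Double quotes: the calling shell expands $$ before sh ever runs,
# so this prints the pid of the calling shell.
sh -c "echo $$"

# Single quotes: the literal text $$ reaches the inner shell, which
# expands it to its own pid (a different number from the line above).
sh -c 'echo $$'

# Escaped inside single quotes: the inner shell sees \$\$ and prints
# the two characters $$ without expanding them.
sh -c 'echo \$\$'
```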
As we can see, it is replaced with 12001, so that's the pid of the shell. We can also see from the output (pstree,12001) that the process with a pid of 12001 is pstree itself. So pstree is the shell?
What I gather is going on there is that bash is being invoked and it is parsing the command line arguments, but then it invokes exec to replace itself with the command being run.
It seems that it only does this in the case of a single remote command:
$ ssh otherhost pstree -a -p -s '$$' \; echo hi
init,1
  `-sshd,3736
      `-sshd,17687
          `-sshd,17690
              `-bash,17691 -c pstree -a -p -s $$ ; echo hi
                  `-pstree,17692 -a -p -s 17691
hi
In this case, I'm requesting two commands be run: pstree followed by echo. And we can see here that bash does in fact show up in the process tree as a parent of pstree.