Why does popen() invoke a shell to execute a process?
The 2004 version of the POSIX system() documentation has a rationale that is likely applicable to popen() as well. Note the stated restrictions on system(), especially the one stating "that the process ID is different":
RATIONALE
...
There are three levels of specification for the system() function. The ISO C standard gives the most basic. It requires that the function exists, and defines a way for an application to query whether a command language interpreter exists. It says nothing about the command language or the environment in which the command is interpreted.
IEEE Std 1003.1-2001 places additional restrictions on system(). It requires that if there is a command language interpreter, the environment must be as specified by fork() and exec. This ensures, for example, that close-on-exec works, that file locks are not inherited, and that the process ID is different. It also specifies the return value from system() when the command line can be run, thus giving the application some information about the command's completion status.
Finally, IEEE Std 1003.1-2001 requires the command to be interpreted as in the shell command language defined in the Shell and Utilities volume of IEEE Std 1003.1-2001.
Note the multiple references to the "ISO C Standard". The latest version of the C standard requires that the command string be processed by the system's "command processor":
7.22.4.8 The system function
Synopsis
#include <stdlib.h>
int system(const char *string);
Description
If string is a null pointer, the system function determines whether the host environment has a command processor. If string is not a null pointer, the system function passes the string pointed to by string to that command processor to be executed in a manner which the implementation shall document; this might then cause the program calling system to behave in a non-conforming manner or to terminate.
Returns
If the argument is a null pointer, the system function returns nonzero only if a command processor is available. If the argument is not a null pointer, and the system function does return, it returns an implementation-defined value.
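To make the null-pointer case concrete, here is a minimal sketch of probing for a command processor before using it. The helper names have_command_processor and run_if_possible are made up for illustration; only system() itself comes from the standard.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the host reports a command
   processor, 0 otherwise. system(NULL) is the query the C standard
   describes. */
int have_command_processor(void)
{
    return system(NULL) != 0;
}

/* Hypothetical helper: runs `command` through system() only when a
   command processor exists. Returns system()'s implementation-defined
   value, or -1 if there is no command processor. */
int run_if_possible(const char *command)
{
    if (!have_command_processor()) {
        fprintf(stderr, "no command processor available\n");
        return -1;
    }
    return system(command);
}
```

On a POSIX system the value returned for a non-null command is a wait status, so it should be decoded with WIFEXITED/WEXITSTATUS rather than compared to the exit code directly.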
Since the C standard requires that the system's "command processor" be used for the system() call, I suspect that:
- Somewhere there's a requirement in POSIX that ties popen() to the system() implementation.
- It's much easier to just reuse the "command processor" entirely since there's also a requirement to run as a separate process.
So this is the glib answer twice-removed.
Invoking a shell allows you to do all the things that you can do in a shell. For example,
FILE *fp = popen("ls *", "r");
is possible with popen() (the shell expands * to all files in the current directory).
Compare it with:
execvp("/bin/ls", (char *[]){"/bin/ls", "*", NULL});
You can't exec ls with * as an argument and get the same result, because exec(2) passes * through literally; ls then looks for a file that is actually named *.
Similarly, pipes (|), redirection (>, <, ...), etc., are possible with popen.
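As a rough illustration of those shell features, the sketch below reads the first line of whatever a shell command prints. The helper name first_line_of is hypothetical, and the commands in the usage note are just examples.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: runs `command` via popen() and copies the first
   line of its output (without the trailing newline) into buf.
   Returns 0 on success, -1 on failure. */
int first_line_of(const char *command, char *buf, size_t size)
{
    if (size == 0)
        return -1;

    FILE *fp = popen(command, "r");   /* the shell parses `command` */
    if (fp == NULL)
        return -1;

    if (fgets(buf, (int)size, fp) == NULL)
        buf[0] = '\0';                /* command printed nothing */
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline, if any */

    return pclose(fp) == -1 ? -1 : 0;
}
```

A call such as first_line_of("echo hello | tr 'a-z' 'A-Z'", buf, sizeof buf) works only because the shell sets up the pipeline; quoting also behaves as it does in a shell, so "echo '*'" prints a literal * instead of a file list.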
Otherwise, there's no reason to use popen if you don't need a shell; it's unnecessary. You'll end up with an extra shell process, and all the things that can go wrong in a shell can go wrong in your program (e.g., the command you pass could be incorrectly interpreted by the shell, which is a common security issue). popen() is designed that way. A fork + exec solution is cleaner, without the issues associated with a shell.
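For comparison, a shell-free version might look like the sketch below: a pipe, a fork, and an execvp, with the arguments passed as an array so nothing is ever parsed by a shell. The name run_without_shell is invented here, and error handling is kept to the bare minimum (for example, it does not retry on EINTR).

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the argument vector argv
   directly (no shell), reads up to size-1 bytes of its stdout into
   buf, and NUL-terminates it. Returns the child's exit status, or -1
   on error or if the child did not exit normally. */
int run_without_shell(char *const argv[], char *buf, size_t size)
{
    int fds[2];
    if (size == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: point stdout at the pipe, then replace the image. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);               /* reached only if execvp failed */
    }

    /* Parent: read until EOF or until the buffer is full. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < size - 1 &&
           (n = read(fds[0], buf + used, size - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with (char *[]){"echo", "*", NULL} yields a literal * in buf, because no shell is involved to expand it; the same property is what keeps untrusted arguments from being reinterpreted as shell syntax.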
The glib answer is because the POSIX standard ( http://pubs.opengroup.org/onlinepubs/9699919799/functions/popen.html ) says so. Or rather, it says that it should behave as if the command argument is passed to /bin/sh for interpretation.
So I suppose a conforming implementation could, in principle, also have some internal library function that would interpret shell commands without having to fork and exec a separate shell process. I'm not actually aware of any such implementation, and I suspect getting all the corner cases correct would be pretty tricky.
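The "as if" behavior can be sketched roughly as a fork followed by an execl of sh with -c and the command string, which is the form the POSIX popen() page uses to describe the child's environment. This is only an illustration, not how any particular libc implements it: the function name run_via_sh is made up, the pipe plumbing that popen() adds is left out, and the "/bin/sh" path is an assumption (POSIX leaves the shell's pathname unspecified).

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: hands `command` to the shell for
   interpretation, the way popen() and system() are specified to
   behave. Returns the raw wait status, or -1 on error. */
int run_via_sh(const char *command)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: "/bin/sh" is an assumed location for the sh utility. */
        execl("/bin/sh", "sh", "-c", command, (char *)0);
        _exit(127);               /* reached only if execl failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```

The returned value is a wait status, so a caller would check WIFEXITED(status) and then read WEXITSTATUS(status) to get the command's exit code.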