Why does argv include the program name?
To begin with, note that argv[0] is not necessarily the program name. It is what the caller puts into argv[0] of the execve system call (e.g. see this question on Stack Overflow). (All other variants of exec are not system calls but interfaces to execve.)
Suppose, for instance, the following (using execl):
execl("/var/tmp/mybackdoor", "top", (char *)NULL);
/var/tmp/mybackdoor is what is executed, but argv[0] is set to top, and this is what ps or (the real) top would display. See this answer on U&L SE for more on this.
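To make the point concrete, here is a minimal sketch in C of a parent that picks an arbitrary argv[0] for its child and then reads back what the child believes its name is. The helper name shell_name_seen_by_child is made up for this illustration, and the sketch assumes /bin/sh is a standalone shell such as dash or bash (a multi-call binary like BusyBox would itself dispatch on the fake name).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run /bin/sh with a caller-chosen argv[0] and capture
 * what the shell reports as "$0". Returns 0 on success, -1 on failure. */
static int shell_name_seen_by_child(const char *fake_argv0, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: redirect stdout into the pipe, then exec with a made-up argv[0]. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execl("/bin/sh", fake_argv0, "-c", "printf '%s' \"$0\"", (char *)NULL);
        _exit(127); /* only reached if execl failed */
    }

    /* Parent: collect whatever the child printed. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < outlen - 1 &&
           (n = read(fds[0], out + used, outlen - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Called as shell_name_seen_by_child("top", buf, sizeof buf), the executed file is still /bin/sh, yet the shell's own idea of its name is whatever string the parent passed in.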
Setting all of this aside: before the advent of fancy filesystems like /proc, argv[0] was the only way for a process to learn about its own name. What would that be good for?
- Several programs customize their behavior depending on the name by which they were called (usually by symbolic or hard links, for example BusyBox's utilities; several more examples are provided in other answers to this question).
- Moreover, services, daemons and other programs that log through syslog often prepend their name to the log entries; without this, event tracking would become next to infeasible.
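As a rough illustration of the logging case, a daemon might derive its syslog tag from argv[0] along the following lines. This is only a sketch: ident_from_argv0 and log_startup are hypothetical names, and the choice of LOG_PID and LOG_DAEMON is an assumption for the example, not something any particular daemon is known to use.

```c
#include <string.h>
#include <syslog.h>

/* Return the last path component of argv[0], e.g. "/usr/sbin/mydaemon"
 * becomes "mydaemon". Falls back to "unknown" if argv[0] is missing or empty. */
static const char *ident_from_argv0(const char *argv0)
{
    if (argv0 == NULL || argv0[0] == '\0')
        return "unknown";
    const char *slash = strrchr(argv0, '/');
    return (slash != NULL && slash[1] != '\0') ? slash + 1 : argv0;
}

/* Tag the log entry with the program's own name, as taken from argv[0]. */
static void log_startup(const char *argv0)
{
    openlog(ident_from_argv0(argv0), LOG_PID, LOG_DAEMON);
    syslog(LOG_INFO, "starting up");
    closelog();
}
```

A main function would simply call log_startup(argv[0]) early on. The fallback matters because the caller of execve is free to pass an empty argument vector.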
Plenty:
- Bash runs in POSIX mode when argv[0] is sh. It runs as a login shell when argv[0] begins with -.
- Vim behaves differently when run as vi, view, evim, eview, ex, vimdiff, etc.
- Busybox, as already mentioned.
- In systems with systemd as init, shutdown, reboot, etc. are symlinks to systemctl.
- and so on.
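The two Bash rules in the first item can be sketched as a small C function. This is not Bash's actual source, just an approximation of the checks described above; the struct and function names are invented for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of what a shell might infer from its argv[0]. */
struct shell_mode {
    bool login_shell; /* argv[0] started with '-' */
    bool posix_mode;  /* invoked under the name "sh" */
};

static struct shell_mode shell_mode_from_argv0(const char *argv0)
{
    struct shell_mode mode = { false, false };
    if (argv0 == NULL)
        return mode;

    /* A leading '-' is the conventional marker for a login shell. */
    if (argv0[0] == '-') {
        mode.login_shell = true;
        argv0++; /* look at the name behind the dash */
    }

    /* Compare only the last path component against "sh". */
    const char *slash = strrchr(argv0, '/');
    const char *base = (slash != NULL) ? slash + 1 : argv0;
    mode.posix_mode = (strcmp(base, "sh") == 0);
    return mode;
}
```

With this sketch, "-bash" is classified as a login shell without POSIX mode, while "/bin/sh" gets POSIX mode without being a login shell.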
Historically, argv is just an array of pointers to the "words" of the command line, so it makes sense to start with the first "word", which happens to be the name of the program.
And there are quite a few programs that behave differently according to which name is used to call them, so you can just create different links to them and get different "commands". The most extreme example I can think of is BusyBox, which acts like several dozen different "commands" depending on how it is called.
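The multi-call idea boils down to a lookup on the last path component of argv[0]. The following C sketch shows that dispatch under stated assumptions: the two toy applets and all names in it are made up for the example, and this is not how BusyBox itself is implemented.

```c
#include <stddef.h>
#include <string.h>

typedef int (*applet_fn)(void);

/* Two toy "commands" living in the same binary. */
static int applet_true(void)  { return 0; }
static int applet_false(void) { return 1; }

/* Table mapping the invoked name to the function that implements it. */
static const struct {
    const char *name;
    applet_fn run;
} applets[] = {
    { "true",  applet_true  },
    { "false", applet_false },
};

/* Pick an applet from the basename of argv[0]; NULL if none matches. */
static applet_fn find_applet(const char *argv0)
{
    if (argv0 == NULL)
        return NULL;
    const char *slash = strrchr(argv0, '/');
    const char *base = (slash != NULL) ? slash + 1 : argv0;
    for (size_t i = 0; i < sizeof applets / sizeof applets[0]; i++) {
        if (strcmp(base, applets[i].name) == 0)
            return applets[i].run;
    }
    return NULL;
}
```

A main function would call find_applet(argv[0]) and run the result, so that hard or symbolic links named true and false pointing at the same executable behave as two different commands.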
Edit: References for Unix 1st edition, as requested
One can see e.g. from the main function of cc that argc and argv were already used. The shell copies arguments to the parbuf inside the newarg part of the loop, while treating the command itself in the same way as the arguments. (Of course, later on it executes only the first argument, which is the name of the command.) It looks like execv and relatives didn't exist then.