When to call fork() and exec() by themselves?
Sure! A common pattern in "wrapper" programs is to do some setup and then replace the wrapper process with some other program using only an exec call (no fork):
#!/bin/sh
export BLAH_API_KEY=blub
...
exec /the/thus/wrapped/program "$@"
A real-life example of this is GIT_SSH (though git(1) also offers GIT_SSH_COMMAND if you do not want to use the wrapper program method above).
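For comparison, here is a minimal sketch of the same exec-only wrapper idea in Python. The function name exec_wrapped is made up for illustration; the environment variable is the one from the shell script above.

```python
import os


def exec_wrapped(program, args, extra_env):
    """Adjust the environment, then replace this process with `program`.

    No fork happens: the PID stays the same and, on success, execv
    never returns because the Python interpreter is overlaid.
    """
    os.environ.update(extra_env)  # also updates the C-level environment
    os.execv(program, [program] + list(args))
```

A wrapper script would end with something like exec_wrapped("/the/thus/wrapped/program", sys.argv[1:], {"BLAH_API_KEY": "blub"}), which plays the role of the final exec line in the shell version.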
Fork-only (no exec) is used when spawning a bunch of worker processes, e.g. Apache httpd in fork mode (though fork-only better suits processes that need to burn up the CPU than those that twiddle their thumbs waiting for network I/O to happen), or for the privilege separation used by sshd and other programs on OpenBSD:
$ doas pkg_add pstree
...
$ pstree | grep sshd
|-+= 70995 root /usr/sbin/sshd
| \-+= 28571 root sshd: jhqdoe [priv] (sshd)
| \-+- 14625 jhqdoe sshd: jhqdoe@ttyp6 (sshd)
On client connect, the root sshd forked off a copy of itself (28571), and that copy in turn forked another (14625) for the privilege separation.
There are plenty.
Programs that call fork() without exec() are usually following a pattern of spawning child worker processes that perform various tasks in processes separate from the main one. You'll find this in programs as varied as dhclient, php-fpm, and urxvtd.
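The worker pattern can be sketched roughly as follows. This is a toy illustration, not how any of the programs named above are implemented; run_workers and its parameters are hypothetical names. Each child is a copy of the same program, does its task, reports the result over a pipe, and exits without ever calling exec.

```python
import os


def run_workers(tasks, work):
    """Fork one child per task; each child runs work(task) and never execs."""
    children = []
    for task in tasks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: same program image as the parent, just a copy.
            os.close(read_fd)
            try:
                os.write(write_fd, repr(work(task)).encode())
            finally:
                os._exit(0)  # skip the parent's cleanup handlers
        # Parent: keep only the read end and carry on forking.
        os.close(write_fd)
        children.append((pid, read_fd, task))

    results = {}
    for pid, read_fd, task in children:
        with os.fdopen(read_fd) as pipe:
            results[task] = pipe.read()  # EOF arrives when the child exits
        os.waitpid(pid, 0)  # reap the child
    return results
```

The results come back as strings because the children send repr() of the value; a real program would pick a proper serialisation.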
A program that calls exec() without fork() is chain loading: overlaying its process with a different program image. There is a whole subculture of chain loading utilities that do particular things to process state and then execute another program to run with that revised process state. Such utilities are common in the daemontools family of service and system management toolsets, but are not limited to those. A few examples:
- chpst from Gerrit Pape's runit
- s6-softlimit and s6-envdir from Laurent Bercot's s6
- local-reaper and move-to-control-group from my nosh toolset
- rtprio and idprio on FreeBSD
- numactl on FreeBSD and Linux
The daemontools family toolsets have a lot of such tools, from machineenv through find-matching-jvm to runtool.
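To make the chain loading idea concrete, here is a minimal sketch of such a utility in Python. It is not the interface of any of the tools listed above; chain_with_nofile_limit is a hypothetical name, and the open-files limit is just one example of process state that survives exec.

```python
import os
import resource


def chain_with_nofile_limit(soft_limit, argv):
    """Lower the soft open-files limit, then chain to the program in argv.

    The limit is process state that survives exec, so the next program
    in the chain starts life already restricted.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    os.execvp(argv[0], argv)  # does not return on success
```

Because the program to run comes from the chain loader's own argument list, such tools can be stacked: the next program can itself be another chain loader that changes some other piece of state and execs again.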
In addition to the other answers: debuggers using ptrace typically make use of the gap between fork and exec. A debuggee should mark itself with PTRACE_TRACEME to indicate that it is to be traced by its parent process, the debugger. This gives the debugger the permissions it needs.
So a debugger first forks itself. The child calls ptrace with PTRACE_TRACEME and then calls exec. Whichever program the child execs will now be traceable by the parent.
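A rough, Linux-only sketch of that sequence, calling ptrace through ctypes. The request numbers are the Linux values from sys/ptrace.h, trace_until_exec is a made-up name, and a real debugger would inspect the tracee while it is stopped instead of continuing straight away.

```python
import ctypes
import os
import signal
import sys

# Linux request numbers from <sys/ptrace.h> (assumption: Linux only).
PTRACE_TRACEME = 0
PTRACE_CONT = 7

libc = ctypes.CDLL(None, use_errno=True)
libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_int,
                        ctypes.c_void_p, ctypes.c_void_p]
libc.ptrace.restype = ctypes.c_long


def trace_until_exec(argv):
    """Fork, mark the child as traced, exec argv, and watch it stop."""
    pid = os.fork()
    if pid == 0:
        # Child: this is the gap between fork and exec.
        if libc.ptrace(PTRACE_TRACEME, 0, None, None) != 0:
            os._exit(126)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    # Parent (the "debugger"): a traced child stops with SIGTRAP
    # once its exec has completed.
    _, status = os.waitpid(pid, 0)
    if not os.WIFSTOPPED(status):
        raise RuntimeError("child did not stop after exec")
    stop_signal = os.WSTOPSIG(status)

    # Let the new program run to completion and reap it.
    libc.ptrace(PTRACE_CONT, pid, None, None)
    _, status = os.waitpid(pid, 0)
    return stop_signal, os.WEXITSTATUS(status)
```

The stop reported by the first waitpid is the point at which a debugger would set breakpoints or read registers in the freshly exec'd program.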