Systemd and process spawning: child processes are killed when main process exits
I managed to fix this simply by setting KillMode to process instead of control-group (the default). Thanks all!
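For reference, that setting could be applied with a drop-in along these lines. This is a minimal sketch: the unit name myscript.service and the drop-in path are hypothetical, since the original unit file is not shown.

```ini
# /etc/systemd/system/myscript.service.d/override.conf
# Hypothetical drop-in; "myscript.service" is an assumed unit name.
[Service]
# Only the main process is killed on stop; the other processes in the
# control group are left running. The default is control-group.
KillMode=process
```

After adding or changing a drop-in, systemctl daemon-reload makes systemd re-read the unit.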
I have a Python script that forks when it launches, and is responsible for starting a bunch of other processes.
Which indicates that you are doing it wrongly. More on this in a moment.
when the script exits the child processes are orphaned and continue to run.
This is not correct dæmon behaviour. If the "main" process (in this case the child that you have forked, since you have specified Type=forking) exits, systemd considers the service to have deactivated and terminates any other still-running processes in the control group in order to tidy up.
Sometimes the conversion from System 5 rc scripts to systemd is not straightforward, because the right way to do things under systemd is quite different. The right way to do (say) OpenVPN, or OpenStack, or OSSEC HIDS in systemd is not the same as one would do it with an rc script. The fact that you have a script that forks, then spawns a whole load of grandchild processes, then exits expecting those grandchildren to keep running indicates that you are perpetrating the same sort of horror as ossec-control, albeit with two fewer levels of forking. If you find yourself writing a "master" script that checks "enable" flags and runs child processes for the "enabled" parts of your system, then you are making the same mistake as the horrendous ossec-control.
No such home-grown mechanisms are necessary with systemd. It already is a service manager. Per https://unix.stackexchange.com/a/200365/5132, the right way to go about this in systemd is not to have one service that spawns some wacky and confused attempt to have "sub-services". It is to have each child process as a fully fledged systemd service in its own right. Then one enables and disables, and starts and stops, the various parts of the system using the normal systemd controls. As you can see in the OSSEC HIDS case, a simple template service unit covers almost all services (one exception is at https://askubuntu.com/a/624871/43344), allowing one to do things such as systemctl enable [email protected] to enable an optional agentlessd service, without any need at all for the horrendous "master script" mechanism that was needed with System 5 rc.
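As a rough illustration of the template-unit idea, here is a sketch. It is not the unit from the linked answer: the unit name example@.service, the program path, and the --foreground flag are all hypothetical.

```ini
# /etc/systemd/system/example@.service
# Hypothetical template unit; the name, path, and --foreground flag are
# illustrative assumptions, not taken from OSSEC HIDS.
[Unit]
Description=Example component %i

[Service]
# Type=simple: the program stays in the foreground and does not fork.
Type=simple
# %i expands to the instance name, the part between "@" and ".service".
ExecStart=/usr/local/bin/example-%i --foreground
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With a unit like this, systemctl enable example@agentlessd.service and systemctl start example@agentlessd.service would act on the agentlessd instance alone, and every other component is controlled the same way under its own instance name.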
There are plenty of cases, not perhaps as extreme as OSSEC HIDS, where such rethinking is necessary. MTSes like exim and sendmail are two such. One might have had a single rc script that spawns a queue runner, an SMTP Submission dæmon, and an SMTP Relay dæmon, with a bunch of ad hoc shell variables in a configuration file to control exactly which are run. But the right way to do this with systemd is to have three proper service units (two of which have associated socket units) and no ad hoc stuff at all, just the regular mechanisms of the service manager.
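One of the socket-plus-service pairs might look roughly like the following. This is a sketch under assumptions: the unit names and the example-smtpd program are hypothetical, and a real MTS needs its own documented command line for serving a single connection on standard input.

```ini
# --- smtp-submission.socket (hypothetical name) ---
[Socket]
# systemd listens on the Submission port itself.
ListenStream=587
# Accept=yes: one service instance is spawned per incoming connection.
Accept=yes

[Install]
WantedBy=sockets.target

# --- smtp-submission@.service (hypothetical name) ---
[Service]
# Hypothetical program that speaks SMTP on its standard input/output.
ExecStart=/usr/local/sbin/example-smtpd
# Hand the accepted connection to the program as stdin/stdout.
StandardInput=socket
```

The relay dæmon would get an equivalent pair on port 25, and the queue runner a plain service unit with no socket. Each is then enabled or disabled independently with systemctl, which replaces the ad hoc shell variables.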