Python multiprocessing crashes docker container
mp.py doesn't look like an equivalent of forever.py. mp.py starts a new worker process that just prints something and then exits, so join() in the main process returns as soon as this worker process is done.
A better equivalent of forever.py: the worker process prints a hello message in an infinite loop, and the main process waits in join() for the worker process to exit. forever-mp.py:
import multiprocessing as mp
from time import sleep


def do_smth():
    # Worker: print a counter once per second, forever.
    i = 0
    while True:
        sleep(1.0)
        i += 1
        print(f'hello {i:3}')


if __name__ == '__main__':
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=do_smth, args=tuple())
    p.start()
    # Blocks until the worker exits, which never happens here.
    p.join()
Updated docker-compose.yml:
version: '3.6'
services:
  bug:
    build:
      context: .
    environment:
      - PYTHONUNBUFFERED=1
    command: su -c "python3.6 forever-mp.py"
Test:
$ docker-compose build && docker-compose up
...
some output
...
Attaching to multiprcs_bug_1_72681117a752
bug_1_72681117a752 | hello 1
bug_1_72681117a752 | hello 2
bug_1_72681117a752 | hello 3
bug_1_72681117a752 | hello 4
Check processes in the container:
$ docker top multiprcs_bug_1_72681117a752
UID PID PPID C STIME TTY TIME CMD
root 38235 38217 0 21:36 ? 00:00:00 su -c python3.6 forever-mp.py
root 38297 38235 0 21:36 ? 00:00:00 python3.6 forever-mp.py
root 38300 38297 0 21:36 ? 00:00:00 /usr/local/bin/python3.6 -c from multiprocessing.semaphore_tracker import main;main(3)
root 38301 38297 0 21:36 ? 00:00:00 /usr/local/bin/python3.6 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=4, pipe_handle=6) --multiprocessing-fork
For a quick fix, do not use the spawn start method and/or do not use su -c ...; both are unnecessary IMO. Change to:
p = mp.Process(target=do_smth, args=tuple())
Or you could start the container with the --init option.
With the spawn start method, Python also starts a semaphore tracker process to prevent semaphores from leaking. You can see this process by pausing mp.py in the middle; it looks like:
472 463 /usr/local/bin/python3 -c from multiprocessing.semaphore_tracker import main;main(3)
This process is started by mp.py but exits after mp.py, so it is not reaped by mp.py; by design, it is supposed to be reaped by init.
The problem is that there is no init in this container (namespace). Instead of init, PID 1 is su -c, so the dead semaphore tracker process is adopted by su.
It appears that su mistakenly takes the dead child process for the command process (forever.py) without checking the relationship, so su exits blindly. When PID 1 exits, the kernel kills all other processes in the container, including forever.py.
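To make the difference concrete, here is a minimal sketch in Python of what a well-behaved PID 1 does when a child exits: it checks which child died and keeps waiting unless it was the command it launched. This is only an illustration of the idea, not how su or the --init helper is actually implemented; the name run_as_init is hypothetical, and the code assumes a POSIX system.

```python
import os
import sys


def run_as_init(argv):
    """Run argv as a child, reap every child that exits, and return
    the exit status of argv itself (hypothetical helper)."""
    main_pid = os.fork()
    if main_pid == 0:
        # Child: replace this process with the requested command.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed.
            os._exit(127)

    while True:
        # Blocks until any child exits, including adopted orphans.
        pid, status = os.wait()
        if pid != main_pid:
            # Some other process (e.g. an orphan adopted by PID 1) died.
            # It has now been reaped; keep waiting for the main command.
            continue
        if os.WIFEXITED(status):
            return os.WEXITSTATUS(status)
        # Main command was killed by a signal: use the shell convention.
        return 128 + os.WTERMSIG(status)


if __name__ == '__main__':
    sys.exit(run_as_init(sys.argv[1:]))
```

The check `pid != main_pid` is the step that su appears to skip in the scenario described above: without it, the first dead child of any kind ends the wait loop.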
This behavior can be observed with strace:
docker run --security-opt seccomp:unconfined --rm -it ex_bug strace -e trace=process -f su -c 'python3 forever.py'
It will output an error message like:
strace: Exit of unknown pid 14 ignored
Ref: Docker and the PID 1 zombie reaping problem (phusion.nl)