Why do processes spawned by cron end up defunct?

to my opinion it's caused by process CROND (spawned by crond for every task) waiting for input on stdin which is piped to the stdout/stderr of the command in the crontab. This is done because cron is able to send resulting output via mail to the user.

So CROND is waiting for EOF till the user command and all it's spawned child processes have closed the pipe. If this is done CROND continues with the wait-statement and then the defunct user command disappears.

So I think you have to explicitly disconnect every spawned subprocess in your script form the pipe (e.g. by redirecting it to a file or /dev/null.

so the following line should work in crontab :

* * * * * ( /tmp/launcher.sh /tmp/tester.sh &>/dev/null & ) 

Because they haven't been the subject of a wait(2) system call.

Since someone may wait for these processes in the future, the kernel can't completely get rid of them or it won't be able to execute the wait system call because it won't have the exit status or evidence of its existence any more.

When you start one from the shell, your shell is trapping SIGCHLD and doing various wait operations anyway, so nothing stays defunct for long.

But cron isn't in a wait state, it is sleeping, so the defunct child may stick around for a while until cron wakes up.


Update:   Responding to comment... Hmm. I did manage to duplicate the issue:

 PPID   PID  PGID  SESS COMMAND
    1  3562  3562  3562 cron
 3562  1629  3562  3562  \_ cron
 1629  1636  1636  1636      \_ sh <defunct>
    1  1639  1636  1636 sleep

So, what happened was, I think:

  • cron forks and cron child starts shell
  • shell (1636) starts sid and pgid 1636 and starts sleep
  • shell exits, SIGCHLD sent to cron 3562
  • signal is ignored or mishandled
  • shell turns zombie. Note that sleep is reparented to init, so when the sleep exits init will get the signal and clean up. I'm still trying to figure out when the zombie gets reaped. Probably with no active children cron 1629 figures out it can exit, at that point the zombie will be reparented to init and get reaped. So now we wonder about the missing SIGCHLD that cron should have processed.
    • It isn't necessarily vixie cron's fault. As you can see here, libdaemon installs a SIGCHLD handler during daemon_fork(), and this could interfere with signal delivery on a quick exit by intermediate 1629

      Now, I don't even know if vixie cron on my Ubuntu system is even built with libdaemon, but at least I have a new theory. :-)


I’d recommend that you solve the problem by simply not having two separate processes: Have launcher.sh do this on its last line:

exec "$@"

This will eliminate the superfluous process.


I suspect that cron is waiting for all subprocesses in the session to terminate. See wait(2) with respect to negative pid arguments. You can see the SESS with:

ps faxo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm

Here's what I see (edited):

STAT  EUID  RUID TT       TPGID  SESS  PGRP  PPID   PID %CPU COMMAND
Ss       0     0 ?           -1  3197  3197     1  3197  0.0 cron
S        0     0 ?           -1  3197  3197  3197 18825  0.0  \_ cron
Zs    1000  1000 ?           -1 18832 18832 18825 18832  0.0      \_ sh <defunct>
S     1000  1000 ?           -1 18832 18832     1 18836  0.0 sleep

Notice that the sh and the sleep are in the same SESS.

Use the command setsid(1). Here's tester.sh:

#!/bin/bash
setsid sleep 27 # the real script launches a compiled C program in the background

Notice you don't need &, setsid puts it in the background.