What happens when a thread forks?

The new process will be the child of the main thread that created the thread. I think.

fork creates a new process. The parent of a process is another process, not a thread. So the parent of the new process is the old process.

Note that the child process will only have one thread because fork only duplicates the (stack for the) thread that calls fork. (This is not entirely true: the entire memory is duplicated, but the child process will only have one active thread.)

If its parent finishes first, the new process will be attached to init process.

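If the parent finishes first, the child is not terminated automatically; it gets init as its new parent. A SIGHUP signal is only sent in specific cases, for example when the exiting parent is the controlling process of a terminal, in which case the foreground process group of that terminal receives it. If the child does not exit as a result of the SIGHUP, it keeps running under its new parent. See also the man pages for nohup and signal(7) for a bit more information on SIGHUP.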

And its parent is main thread, not the thread that created it.

The parent of a process is a process, not a specific thread, so it is not meaningful to say that the main or child thread is the parent. The entire process is the parent.
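Both points can be checked with a small experiment. The sketch below uses Python, whose os.fork() is a thin wrapper around fork(2); the helper name fork_from_worker is made up for this illustration. It forks from a non-main thread and lets the child report its parent PID and how many threads it sees. Note that threading.active_count() is Python's own bookkeeping, which is reset after fork to match the single surviving thread, and that recent Python versions print a DeprecationWarning when forking a multi-threaded process.

```python
import os
import threading


def fork_from_worker():
    """Fork from a non-main thread and return what the child observed."""
    result = {}

    def worker():
        read_fd, write_fd = os.pipe()
        parent_pid = os.getpid()
        pid = os.fork()
        if pid == 0:
            # Child: report parent PID and thread count, then leave
            # immediately without running any of the parent's cleanup code.
            report = f"{os.getppid()} {threading.active_count()}"
            os.write(write_fd, report.encode())
            os._exit(0)
        # Parent (still inside the worker thread).
        os.close(write_fd)
        data = os.read(read_fd, 64).decode()
        os.close(read_fd)
        os.waitpid(pid, 0)
        ppid, threads = data.split()
        result.update(
            parent_pid=parent_pid,
            child_ppid=int(ppid),
            child_threads=int(threads),
        )

    # Keep an extra thread alive so the parent really is multi-threaded.
    stop = threading.Event()
    idle = threading.Thread(target=stop.wait)
    idle.start()
    forker = threading.Thread(target=worker)
    forker.start()
    forker.join()
    stop.set()
    idle.join()
    return result


if __name__ == "__main__":
    print(fork_from_worker())
```

The child's PPID should equal the PID of the whole parent process, regardless of which thread called fork, and the child should report a single thread.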

One final note: Mixing threads and fork must be done with care. Some of the pitfalls are discussed here.


Correct me if I am wrong.

Will do :)

As fork() is a POSIX system call, its behavior is well defined:

A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html

A forked child is an exact duplicate of its parent, yet only the thread that called fork() in the parent still exists in the child, where it becomes the new main thread. This remains the case until you call exec(), which replaces the process image entirely.

The POSIX description "shall be created with a single thread" can be misleading: most implementations really do duplicate the parent's entire address space, so the stacks and data of all other threads are copied as well. What is missing in the child are the threads themselves: they have no entry in the kernel's scheduler, so the system never assigns any CPU time to them and they can never run again.

An easier mental image is the following:

When the parent calls fork, the entire process is frozen for a moment, atomically duplicated, and then the parent is unfrozen as a whole, yet in the child only the one thread that called fork is unfrozen, everything else stays frozen.

That's why it isn't safe to call arbitrary functions between fork() and exec(), as the POSIX standard also points out. Ideally you shouldn't do much more than closing or duplicating file descriptors, setting or restoring signal handlers, and then calling exec().
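As an illustration of that "fork, minimal plumbing, exec" shape, here is a sketch in Python (os.fork(), os.dup2() and os.execvp() map onto the corresponding POSIX calls). The helper name run_and_capture is invented for this example, and the Python interpreter itself does more behind the scenes than a strictly async-signal-safe C child would, so treat it as a picture of the pattern rather than a guarantee of safety.

```python
import os


def run_and_capture(argv):
    """Fork, point the child's stdout at a pipe, then exec argv."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: nothing but file-descriptor plumbing between fork and exec.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)  # the pipe becomes the child's stdout
            os.close(write_fd)
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; exit without cleanup handlers.
            os._exit(127)

    # Parent: read everything the child wrote, then reap it.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status), b"".join(chunks)


if __name__ == "__main__":
    print(run_and_capture(["echo", "hello"]))
```

Once exec succeeds, the copied locks and the leftover state of the vanished threads are discarded along with the old process image, which is why this pattern avoids the problem.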


However, what will happen if a thread creates a new process using fork()?

A new process will be created by copying the entire address space of the calling process (all threads of a process share one address space), but only the calling thread is replicated in the child. Calling fork() in a multi-threaded program is generally considered a bad idea because it's very hard to get right. POSIX says the child process (created in a multi-threaded program) can only call async-signal-safe functions until it calls one of the exec* functions.

If its parent finishes first, the new process will be attached to init process.

The child process is typically inherited by the init process. If the parent process is a controlling process (e.g. shell), then POSIX requires:

If the process is a controlling process, the SIGHUP signal shall be sent to each process in the foreground process group of the controlling terminal belonging to the calling process.

However, this is not true for most processes as most processes aren't controlling processes.
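The ordinary case, where the parent is not a controlling process, can be observed with a double fork. In the Python sketch below (the helper name orphan_new_parent and the max_wait parameter are made up for this illustration), an intermediate process forks a grandchild and exits at once; the grandchild polls getppid() until it changes and reports the new value. No SIGHUP is involved: the orphan simply keeps running under a new parent, which is typically init (PID 1) but may be another reaper process depending on the system.

```python
import os
import time


def orphan_new_parent(max_wait=5.0):
    """Return (pid of the exited parent, PPID the orphan saw afterwards)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Intermediate process: it will be the short-lived parent.
        me = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait until the kernel reports a different parent.
            deadline = time.monotonic() + max_wait
            while os.getppid() == me and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, f"{me} {os.getppid()}".encode())
            os._exit(0)
        os._exit(0)  # the parent finishes first

    os.close(write_fd)
    os.waitpid(pid, 0)  # reap the intermediate process
    data = os.read(read_fd, 64).decode()
    os.close(read_fd)
    old_parent, new_parent = (int(x) for x in data.split())
    return old_parent, new_parent


if __name__ == "__main__":
    old, new = orphan_new_parent()
    print(f"parent was {old}, orphan now belongs to {new}")
```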

And its parent is main thread, not the thread that created it.

The parent of a forked child will always be the process that called fork(). So the PPID of the child process will be the PID of your program.


The problem stems from the behaviour of fork(2) itself. Whenever a new child process is created with fork(2), the new process gets a new memory address space, but everything in memory is copied from the old process (with copy-on-write that’s not 100% true, but the semantics are the same).

If we call fork(2) in a multi-threaded environment the thread doing the call is now the main-thread in the new process and all the other threads, which ran in the parent process, are dead. And everything they did was left exactly as it was just before the call to fork(2).

Now imagine that these other threads were happily doing their work before the call to fork(2) and a couple of milliseconds later they are dead. What if something these now-dead threads did was not meant to be left exactly as it was?

Let me give you an example. Let’s say our main thread (the one which is going to call fork(2)) was sleeping while we had lots of other threads happily doing some work. Allocating memory, writing to it, copying from it, writing to files, writing to a database and so on. They were probably allocating memory with something like malloc(3). Well, it turns out that malloc(3) uses a mutex internally to guarantee thread-safety. And exactly this is the problem.

What if one of these threads was inside malloc(3) and had acquired the mutex at the exact moment the main thread called fork(2)? In the new child process the lock is still held by a now-dead thread, which will never release it.

The new child process has no way of knowing whether it’s safe to use malloc(3) or not. In the worst case it will call malloc(3) and block waiting for the lock, which it will never get, since the thread that is supposed to release it is dead. And this is just malloc(3). Think about all the other possible mutexes and locks in database drivers, file handling libraries, networking libraries and so on.
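The same effect can be reproduced with an ordinary application-level lock standing in for the mutex inside malloc(3). In this Python sketch (the helper name child_can_take_lock is made up for the illustration), a helper thread holds a threading.Lock while the main thread forks. The child tries to take the lock with a timeout so that the demonstration ends instead of hanging forever.

```python
import os
import threading


def child_can_take_lock(timeout=0.5):
    """Fork while another thread holds a lock.

    Returns True if the child managed to acquire the lock, False otherwise.
    """
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            holding.set()   # tell the main thread the lock is now held
            release.wait()  # keep holding it across the fork

    t = threading.Thread(target=holder)
    t.start()
    holding.wait()

    pid = os.fork()
    if pid == 0:
        # Child: the holder thread does not exist here, but the lock it
        # held was copied in its locked state.
        got_it = lock.acquire(timeout=timeout)
        os._exit(0 if got_it else 1)

    # Parent: collect the child's verdict, then let the holder finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    return os.waitstatus_to_exitcode(status) == 0


if __name__ == "__main__":
    print("child acquired the lock:", child_can_take_lock())
```

The child is expected to time out: nothing in the child process will ever release the lock, and without the timeout it would block forever, just like the malloc(3) case described above.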

For a full explanation, you can go through this link.