fork() and CreateProcess()

They do different things, and on different systems. CreateProcess is a Windows-only function, while fork exists only on POSIX systems (e.g. Linux and macOS).

The fork system call creates a new process and continues execution in both the parent and the child from the point where fork was called. CreateProcess creates a new process and loads a program from disk. The only similarity is that the end result is a new process.

For more information, read the respective manual page on CreateProcess and fork.


CreateProcess takes the following steps:

  • Create and initialize the process control block (PCB) in the kernel.
  • Create and initialize a new address space.
  • Load the program prog into the address space.
  • Copy arguments args into memory in the address space.
  • Initialize the hardware context to start execution at “start”.
  • Inform the scheduler that the new process is ready to run.

Unix's fork takes the following steps:

  • Create and initialize the process control block (PCB) in the kernel
  • Create a new address space
  • Initialize the address space with a copy of the entire contents of the address space of the parent
  • Inherit the execution context of the parent (e.g., any open files)
  • Inform the scheduler that the new process is ready to run
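As a rough illustration of the "inherit the execution context" step, here is a minimal sketch in which a pipe opened before the fork is still usable in the child. The function name read_from_forked_child is made up for this example, and error handling is kept to a minimum.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that writes a message into a pipe
 * opened before the fork. Returns the number of bytes the parent read,
 * or -1 on error. */
ssize_t read_from_forked_child(char *buf, size_t len)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: the descriptors were inherited, so fds[1] is valid here. */
        const char msg[] = "hello from child";
        close(fds[0]);
        if (write(fds[1], msg, sizeof msg) < 0)  /* sizeof includes the NUL */
            _exit(1);
        close(fds[1]);
        _exit(0);
    }

    /* Parent: read what the child wrote, then reap the child. */
    close(fds[1]);
    ssize_t n = read(fds[0], buf, len);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Calling it with a 64-byte buffer from main and printing the buffer should show the child's message arriving in the parent, even though the child never opened anything itself.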

fork creates a complete copy of the parent process, and the parent doesn't set up the runtime environment for the child, because the parent process trusts its own setup. The child is a complete copy of the parent except for its process ID (and the value fork returns, which differs between the two). A forked process continues to run the same program as its parent until it performs an explicit exec. When the child calls exec, the new executable image is loaded into memory and runs.

How is it efficient to make a complete copy? Copy-on-write. fork really only copies the virtual memory map. All of the shared segments (pages, on modern systems) are marked read-only. If the parent or child writes to one of them, a fault is raised and the kernel makes a private copy of just that piece of memory for the writer. This is explained nicely in this answer.
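The copying itself happens inside the kernel and can't be observed directly from C, but its visible effect can: a write in the child never shows up in the parent. A minimal sketch (the function name parent_value_after_child_write is made up for this example):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child overwrites a variable, and the parent
 * reports what it still sees afterwards. Returns -1 if fork fails. */
int parent_value_after_child_write(void)
{
    volatile int value = 42;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's own copy of the memory. */
        value = 99;
        _exit(value == 99 ? 0 : 1);
    }

    /* Parent: wait for the child, then look at our own copy. */
    waitpid(pid, NULL, 0);
    return value;
}
```

The parent gets 42 back regardless of what the child did. Whether the kernel copied the memory eagerly or lazily is an implementation detail this sketch cannot distinguish.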

There are several benefits to sharing resources between parent and child:

  • Intuitively, resource management: less memory is needed to maintain the state of the processes.
  • Shared cache resources mean greater temporal locality when data is not overwritten, which improves performance because retrieving data from larger caches or from disk is time-consuming.

Disadvantages of sharing resources:

  • When writes are common, each write puts the data in an invalid state for the other process. This leads to coherency misses, which are costly if the child process is running on a separate core, because the changes have to propagate up to the L3 cache.

In general, though, programs read a heck of a lot more than they write. Typically the child or parent only needs to write to its stack, and that's a small portion of its program block.

Additionally, Unix fork is different because it returns twice: once in the parent (with the process ID of its child) and once in the child (with 0; congrats, you're a new baby process). That is how we distinguish in our code whether we are the child or the parent.

Unix's exec does the following:

  • Load the program prog into the current address space.
  • Copy arguments args into memory in the address space.
  • Initialize the hardware context to start execution at “start.”

The parent has the option to wait for the child to finish. When the child finishes by calling exit, the parent's wait returns.
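Putting fork, exec and wait together gives roughly the POSIX counterpart of a single CreateProcess call. The sketch below assumes /bin/sh exists on the system; the function name run_and_wait is made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `sh -c command` in a child process and
 * returns its exit status, or -1 if something went wrong. */
int run_and_wait(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the shell. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);  /* only reached if exec itself failed */
    }

    /* Parent: block until the child exits, then decode its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_and_wait("exit 7") should return 7 in the parent, since that is the status the child's shell exits with.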


I will give two examples to show the difference:
fork():

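#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int fib(int n);

int main(void)
{
    int child_ret, input_num = -1;
    pid_t pid1;

    /* Keep asking until we get a non-negative integer. */
    while (input_num < 0) {
        printf("\nPlease input a non-negative number:  ");
        if (scanf("%d", &input_num) != 1) {
            fprintf(stderr, "Invalid input\n");
            return EXIT_FAILURE;
        }
    }

    /* Flush before forking so buffered output is not duplicated in the child. */
    fflush(stdout);

    if ((pid1 = fork()) < 0) {
        perror("fork");
        return EXIT_FAILURE;
    }
    else if (pid1 == 0) {
        /* Child: fork returned 0. */
        printf("\nI am the child process, my PID is %d.\n\nThe first %d numbers of the Fibonacci sequence are:\n", (int)getpid(), input_num);
        for (int i = 0; i < input_num; i++)
            printf("%d\n", fib(i + 1));
    }
    else {
        /* Parent: fork returned the child's PID; wait for the child to exit. */
        wait(&child_ret);
        printf("\nI am the parent process, my PID is %d.\n\n", (int)getpid());
    }
    return 0;
}

/* Returns the n-th Fibonacci number, counting from fib(1) = 0 and fib(2) = 1. */
int fib(int n)
{
    if (n <= 2)
        return n - 1;
    return fib(n - 1) + fib(n - 2);
}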

In this program, fork copies the calling process and returns once in each of the two processes. We call the original process the parent and the copy the child. If a process then calls one of the exec() functions, its whole image is replaced by a new program, but its PID stays the same.

CreateProcess():

#include <windows.h>
#include <stdio.h>
#include <tchar.h>

int _tmain( void )
{
    STARTUPINFO si;
    PROCESS_INFORMATION pi;
    // CreateProcess may modify the command line, so keep it in a writable buffer.
    TCHAR szCmdline[] = TEXT("MyChildProcess");

    ZeroMemory( &si, sizeof(si) );
    si.cb = sizeof(si);
    ZeroMemory( &pi, sizeof(pi) );

    // Start the child process.
    if( !CreateProcess( NULL,   // No module name (use command line)
       szCmdline,      // Command line
       NULL,           // Process handle not inheritable
       NULL,           // Thread handle not inheritable
       FALSE,          // Set handle inheritance to FALSE
       0,              // No creation flags
       NULL,           // Use parent's environment block
       NULL,           // Use parent's starting directory
       &si,            // Pointer to STARTUPINFO structure
       &pi )           // Pointer to PROCESS_INFORMATION structure
       )
    {
       // GetLastError returns a DWORD, so print it as unsigned long.
       printf( "CreateProcess failed (%lu).\n", GetLastError() );
       return 1;
    }

    // Wait until child process exits.
    WaitForSingleObject( pi.hProcess, INFINITE );

    // Close process and thread handles.
    CloseHandle( pi.hProcess );
    CloseHandle( pi.hThread );
    return 0;
}

This example is based on one from MSDN. On Windows, what we launch as a new process must be a separate program (an *.exe file). The new process is an entirely new one; its only connection to the process that created it is its return value.
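In conclusion, we often treat fork()+exec() as the equivalent of CreateProcess(). On its own, fork() is in some ways more similar to CreateThread() in Windows, in that it starts a second flow of control in the same program, although a forked child gets its own copy of the address space while a thread shares it.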