How to make main thread wait for all child threads finish?

First create all the threads, then join all of them:

pthread_t tid[2];

/// create all threads
for (int i = 0; i < 2; i++) {
    pthread_create(&tid[i], NULL, routine, NULL);
}

/// wait all threads by joining them
for (int i = 0; i < 2; i++) {
    pthread_join(tid[i], NULL);  
}

Alternatively, have some pthread_attr_t variable, use pthread_attr_init(3) then pthread_attr_setdetachedstate(3) on it, then pass its address to pthread_create(3) second argument. Thos would create the threads in detached state. Or use pthread_detach as explained in Jxh's answer.

Remember to read some good Pthread tutorial. You may want to use mutexes and condition variables.

You could use frameworks wrapping them, e.g. Qt or POCO (in C++), or read a good C++ book and use C++ threads.

Conceptually, threads have each their call stack and are related to continuations. They are "heavy".

Consider some agent-oriented programming approach: as a rule of thumb, you don't want to have a lot of threads (e.g. 20 threads on a 10 core processor is reasonable, 200 threads won't be unless a lot of them are sleeping or waiting) and and do want threads to synchronize using mutex and condition variables and communicate and/or synchronize with other threads quite often (several times per second). See also poll(2), fifo(7), unix(7), sem_overview(7) with shm_overview(7) as another way of communicating between threads. In general, avoid using signal(7) with threads (read signal-safety(7)...), and use dlopen(3) with caution (probably only in the main thread).

A pragmatical approach would be to have most of your threads running some event loop (using poll(2), pselect(2), perhaps eventfd(2), signalfd(2), ....), perhaps communicating using pipe(7) or unix(7) sockets. See also socket(7).

Don't forget to document (on paper) the communication protocols between threads. For a theoretical approach, read books about π-calculus and be aware of Rice's theorem : debugging concurrent programs is difficult.


You could start the threads detached, and not worry about joining.

for (int i = 0; i < 2; i++) {
    pthread_t tid;
    pthread_create(&tid, NULL, routine, NULL);
    pthread_detach(tid);
}
pthread_exit(0);

Or, alternatively, you can have the thread that dies report back to the main thread who it is, so that the threads are joined in the order they exited, rather than in the order you created them in.

void *routine(void *arg)
{
    int *fds = (int *)arg;
    pthread_t t = pthread_self();
    usleep((rand()/(1.0 + RAND_MAX)) * 1000000);
    write(fds[1], &t, sizeof(t));
}

int main()
{
    int fds[2];
    srand(time(0));
    pipe(fds);
    for (int i = 0; i < 2; i++) {
        pthread_t tid;
        pthread_create(&tid, NULL, routine, fds);
        printf("created: %llu\n", (unsigned long long)tid);
    }
    for (int i = 0; i < 2; i++) {
        pthread_t tid;
        read(fds[0], &tid, sizeof(tid));
        printf("joining: %llu\n", (unsigned long long)tid);
        pthread_join(tid, 0);
    }
    pthread_exit(0);
}

int main()
{
    pthread_t tid[2];
    for (int i = 0; i < 2; i++) {
        pthread_create(&tid[i], NULL, routine, NULL);
    }
    for (int i = 0; i < 2; i++)
       pthread_join(tid[i], NULL);
    return 0;
}