How do I start threads in plain C?
AFAIK, ANSI C doesn't define threading, but there are various libraries available.
If you are running on Windows, link to msvcrt and use _beginthread or _beginthreadex.
If you are running on other platforms, check out the pthreads library (I'm sure there are others as well).
C11 threads + C11 atomic_int
Added to glibc 2.28. Tested in Ubuntu 18.10 amd64 (comes with glic 2.28) and Ubuntu 18.04 (comes with glibc 2.27) by compiling glibc 2.28 from source: Multiple glibc libraries on a single host
Example adapted from: https://en.cppreference.com/w/c/language/atomic
main.c
#include <stdio.h>
#include <threads.h>
#include <stdatomic.h>
atomic_int atomic_counter;
int non_atomic_counter;
int mythread(void* thr_data) {
(void)thr_data;
for(int n = 0; n < 1000; ++n) {
++non_atomic_counter;
++atomic_counter;
// for this example, relaxed memory order is sufficient, e.g.
// atomic_fetch_add_explicit(&atomic_counter, 1, memory_order_relaxed);
}
return 0;
}
int main(void) {
thrd_t thr[10];
for(int n = 0; n < 10; ++n)
thrd_create(&thr[n], mythread, NULL);
for(int n = 0; n < 10; ++n)
thrd_join(thr[n], NULL);
printf("atomic %d\n", atomic_counter);
printf("non-atomic %d\n", non_atomic_counter);
}
GitHub upstream.
Compile and run:
gcc -ggdb3 -std=c11 -Wall -Wextra -pedantic -o main.out main.c -pthread
./main.out
Possible output:
atomic 10000
non-atomic 4341
The non-atomic counter is very likely to be smaller than the atomic one due to racy access across threads to the non-atomic variable.
See also: How to do an atomic increment and fetch in C?
Disassembly analysis
Disassemble with:
gdb -batch -ex "disassemble/rs mythread" main.out
contains:
17 ++non_atomic_counter;
0x00000000004007e8 <+8>: 83 05 65 08 20 00 01 addl $0x1,0x200865(%rip) # 0x601054 <non_atomic_counter>
18 __atomic_fetch_add(&atomic_counter, 1, __ATOMIC_SEQ_CST);
0x00000000004007ef <+15>: f0 83 05 61 08 20 00 01 lock addl $0x1,0x200861(%rip) # 0x601058 <atomic_counter>
so we see that the atomic increment is done at the instruction level with the f0
lock prefix.
With aarch64-linux-gnu-gcc
8.2.0, we get instead:
11 ++non_atomic_counter;
0x0000000000000a28 <+24>: 60 00 40 b9 ldr w0, [x3]
0x0000000000000a2c <+28>: 00 04 00 11 add w0, w0, #0x1
0x0000000000000a30 <+32>: 60 00 00 b9 str w0, [x3]
12 ++atomic_counter;
0x0000000000000a34 <+36>: 40 fc 5f 88 ldaxr w0, [x2]
0x0000000000000a38 <+40>: 00 04 00 11 add w0, w0, #0x1
0x0000000000000a3c <+44>: 40 fc 04 88 stlxr w4, w0, [x2]
0x0000000000000a40 <+48>: a4 ff ff 35 cbnz w4, 0xa34 <mythread+36>
so the atomic version actually has a cbnz
loop that runs until the stlxr
store succeed. Note that ARMv8.1 can do all of that with a single LDADD instruction.
This is analogous to what we get with C++ std::atomic
: What exactly is std::atomic?
Benchmark
TODO. Crate a benchmark to show that atomic is slower.
POSIX threads
main.c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <pthread.h>
enum CONSTANTS {
NUM_THREADS = 1000,
NUM_ITERS = 1000
};
int global = 0;
int fail = 0;
pthread_mutex_t main_thread_mutex = PTHREAD_MUTEX_INITIALIZER;
void* main_thread(void *arg) {
int i;
for (i = 0; i < NUM_ITERS; ++i) {
if (!fail)
pthread_mutex_lock(&main_thread_mutex);
global++;
if (!fail)
pthread_mutex_unlock(&main_thread_mutex);
}
return NULL;
}
int main(int argc, char **argv) {
pthread_t threads[NUM_THREADS];
int i;
fail = argc > 1;
for (i = 0; i < NUM_THREADS; ++i)
pthread_create(&threads[i], NULL, main_thread, NULL);
for (i = 0; i < NUM_THREADS; ++i)
pthread_join(threads[i], NULL);
assert(global == NUM_THREADS * NUM_ITERS);
return EXIT_SUCCESS;
}
Compile and run:
gcc -std=c99 -Wall -Wextra -pedantic -o main.out main.c -pthread
./main.out
./main.out 1
The first run works fine, the second fails due to missing synchronization.
There don't seem to be POSIX standardized atomic operations: UNIX Portable Atomic Operations
Tested on Ubuntu 18.04. GitHub upstream.
GCC __atomic_*
built-ins
For those that don't have C11, you can achieve atomic increments with the __atomic_*
GCC extensions.
main.c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
enum Constants {
NUM_THREADS = 1000,
};
int atomic_counter;
int non_atomic_counter;
void* mythread(void *arg) {
(void)arg;
for (int n = 0; n < 1000; ++n) {
++non_atomic_counter;
__atomic_fetch_add(&atomic_counter, 1, __ATOMIC_SEQ_CST);
}
return NULL;
}
int main(void) {
int i;
pthread_t threads[NUM_THREADS];
for (i = 0; i < NUM_THREADS; ++i)
pthread_create(&threads[i], NULL, mythread, NULL);
for (i = 0; i < NUM_THREADS; ++i)
pthread_join(threads[i], NULL);
printf("atomic %d\n", atomic_counter);
printf("non-atomic %d\n", non_atomic_counter);
}
Compile and run:
gcc -ggdb3 -O3 -std=c99 -Wall -Wextra -pedantic -o main.out main.c -pthread
./main.out
Output and generated assembly: the same as the "C11 threads" example.
Tested in Ubuntu 16.04 amd64, GCC 6.4.0.
Since you mentioned fork() I assume you're on a Unix-like system, in which case POSIX threads (usually referred to as pthreads) are what you want to use.
Specifically, pthread_create() is the function you need to create a new thread. Its arguments are:
int pthread_create(pthread_t * thread, pthread_attr_t * attr, void *
(*start_routine)(void *), void * arg);
The first argument is the returned pointer to the thread id. The second argument is the thread arguments, which can be NULL unless you want to start the thread with a specific priority. The third argument is the function executed by the thread. The fourth argument is the single argument passed to the thread function when it is executed.