What is lock-free multithreaded programming?

The key in lock-free programming is to use hardware-intrinsic atomic operations.

As a matter of fact, even locks themselves must use those atomic operations!
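To make that point concrete, here is a minimal sketch of a lock built from nothing but one atomic variable. The names `SpinLock` and `count_with_spinlock` are made up for illustration, and a real `std::mutex` is considerably more sophisticated (it normally puts waiting threads to sleep instead of spinning).

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock: the whole lock is a single atomic bool.
class SpinLock {
public:
    void lock() {
        // exchange() atomically writes true and returns the previous value.
        // If the previous value was already true, someone else holds the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // let other threads run while we wait
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Illustration only: several threads bump a plain int under the spinlock.
int count_with_spinlock(int num_threads, int increments_per_thread) {
    SpinLock lock;
    int total = 0;  // not atomic; protected only by the spinlock
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&lock, &total, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++total;
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) th.join();
    return total;
}
```

Note that this is still a lock: if the thread holding it is suspended between `lock()` and `unlock()`, everyone else spins forever.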

But the difference between locked and lock-free programming is that a lock-free program can never be stalled entirely by any single thread. If, in a locking program, one thread acquires a lock and then gets suspended indefinitely, every thread that needs that lock is blocked and the program cannot make progress. By contrast, a lock-free program can make progress even if individual threads are suspended indefinitely.

Here's a simple example: A concurrent counter increment. We present two versions which are both "thread-safe", i.e. which can be called multiple times concurrently. First the locked version:

#include <mutex>

int counter = 0;
std::mutex counter_mutex;

void increment_with_lock()
{
    // The guard locks the mutex here and unlocks it when the function returns.
    std::lock_guard<std::mutex> guard(counter_mutex);
    ++counter;
}

Now the lock-free version:

#include <atomic>

std::atomic<int> counter(0);

void increment_lockfree()
{
    // operator++ on std::atomic<int> is a single atomic read-modify-write.
    ++counter;
}

Now imagine hundreds of threads all calling one of the increment_* functions concurrently. In the locked version, no other thread can make progress until the lock-holding thread unlocks the mutex. By contrast, in the lock-free version, all threads can make progress. If a thread is held up, it just won't do its share of the work, but everyone else gets to get on with their work.
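As a rough way to try this out, the sketch below starts a number of threads that all increment one atomic counter and then checks that no increment was lost. The helper name `hammer_atomic_counter` and its parameters are assumptions made for this example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: num_threads threads each perform per_thread atomic
// increments on a shared counter; returns the final counter value.
int hammer_atomic_counter(int num_threads, int per_thread) {
    std::atomic<int> counter(0);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                ++counter;  // atomic increment, no mutex involved
            }
        });
    }
    for (auto& th : threads) th.join();
    return counter.load();
}
```

If the counter were a plain `int` without a mutex, the same code would be a data race and the result would be unpredictable.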

It is worth noting that in general lock-free programming trades throughput and mean latency for predictable latency. That is, a lock-free program will usually get less done than a corresponding locking program if there is not too much contention (since atomic operations are slow and affect a lot of the rest of the system), but it guarantees that it will never produce unpredictably large latencies.


For locks, the idea is that you acquire a lock and then do your work knowing that nobody else can interfere, then release the lock.

For "lock-free", the idea is that you do your work somewhere else and then attempt to atomically commit this work to "visible state", and retry if you fail.
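A minimal sketch of that "work elsewhere, then commit, retry on failure" pattern, using `compare_exchange_weak` from `std::atomic`. The helper `atomic_update` and its `compute` parameter are hypothetical names chosen for this example.

```cpp
#include <atomic>

// Hypothetical helper: atomically replace target with compute(old value).
// The "work" (compute) runs on a private copy; the compare-exchange is the
// commit step, and it fails if another thread changed target in the meantime.
template <typename F>
int atomic_update(std::atomic<int>& target, F compute) {
    int observed = target.load();
    int desired;
    do {
        desired = compute(observed);  // do the work on a private snapshot
        // On failure, compare_exchange_weak stores the current value of
        // target into observed, so the next iteration redoes the work.
    } while (!target.compare_exchange_weak(observed, desired));
    return desired;
}
```

For example, `atomic_update(value, [](int x) { return x * 3; })` triples `value` atomically even though there is no hardware "atomic multiply". Under contention the lambda may run several times, which is exactly the discarded work described in the list of problems.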

The problems with "lock-free" are that:

  • it's hard to design a lock-free algorithm for something that isn't trivial. This is because there's only so many ways to do the "atomically commit" part (often relying on an atomic "compare and swap" that replaces a pointer with a different pointer).
  • if there's contention, it performs worse than locks because you're repeatedly doing work that gets discarded/retried
  • it's virtually impossible to design a lock-free algorithm that is both correct and "fair". This means that (under contention) some tasks can be lucky (and repeatedly commit their work and make progress) and some can be very unlucky (and repeatedly fail and retry).
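The pointer-swapping commit mentioned in the first point can be sketched as the push side of a simple lock-free stack. This is an illustration under stated assumptions, not a complete container: `Node` and `PushOnlyStack` are made-up names, and removing single nodes concurrently is left out on purpose because it needs extra care (the ABA problem and safe memory reclamation).

```cpp
#include <atomic>

struct Node {
    int value;
    Node* next;
};

// Hypothetical push-only stack: the head pointer is the "visible state".
class PushOnlyStack {
public:
    void push(int value) {
        // Build the new node privately, pointing at the head we last saw.
        Node* node = new Node{value, head_.load(std::memory_order_relaxed)};
        // Commit: swing head_ from node->next to node. If head_ changed in
        // the meantime, node->next is refreshed with the new head and we retry.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Detach the whole list with one atomic exchange, then sum and free it.
    long drain_sum() {
        Node* node = head_.exchange(nullptr);
        long sum = 0;
        while (node != nullptr) {
            sum += node->value;
            Node* next = node->next;
            delete node;
            node = next;
        }
        return sum;
    }

private:
    std::atomic<Node*> head_{nullptr};
};
```

The retry loop in `push` also shows the fairness problem: nothing prevents one thread from losing the compare-exchange over and over while others keep winning.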

The combination of these things means that it's only good for relatively simple things under low contention.

Researchers have designed things like lock-free linked lists (and FIFO queues and LIFO stacks) and some lock-free trees. I don't think there's anything more complex than those. As for how these things work: because they're hard to design, they're complicated. The most sane approach would be to determine what type of data structure you're interested in, then search the web for relevant research into lock-free algorithms for that data structure.

Also note that there is something called "block free" (the more common name for this property is "wait-free"), which is like lock-free except that you know you can always commit the work and never need to retry. It's even harder to design a block-free algorithm, but contention doesn't matter, so the other two problems with lock-free disappear. Note: the "concurrent counter" example in Kerrek SB's answer is not merely lock-free, but is actually block free, at least on hardware that provides a single atomic increment instruction.