Difference between a "coroutine" and a "thread"?

First read: Concurrency vs Parallelism - What is the difference?

Concurrency is the separation of tasks to provide interleaved execution. Parallelism is the simultaneous execution of multiple pieces of work in order to increase speed. —https://github.com/servo/servo/wiki/Design

Short answer: With threads, the operating system switches running threads preemptively according to its scheduler, which is an algorithm in the operating system kernel. With coroutines, the programmer and programming language determine when to switch coroutines; in other words, tasks are cooperatively multitasked by pausing and resuming functions at set points, typically (but not necessarily) within a single thread.

Long answer: In contrast to threads, which are pre-emptively scheduled by the operating system, coroutine switches are cooperative, meaning the programmer (and possibly the programming language and its runtime) controls when a switch will happen.

In contrast to threads, which are pre-emptive, coroutine switches are cooperative (programmer controls when a switch will happen). The kernel is not involved in the coroutine switches. —http://www.boost.org/doc/libs/1_55_0/libs/coroutine/doc/html/coroutine/overview.html

A language that supports native threads can map its threads (user threads) onto the operating system's threads (kernel threads). Every process has at least one kernel thread. Kernel threads are like processes, except that they share the memory space of their owning process with all other threads in that process. A process "owns" all its assigned resources, like memory, file handles, sockets, device handles, etc., and these resources are all shared among its kernel threads.

The operating system scheduler is the part of the kernel that runs each thread for a certain amount of time (on a single processor machine). The scheduler allocates time (timeslicing) to each thread, and if the thread isn't finished within that time, the scheduler pre-empts it (interrupts it and switches to another thread). Multiple threads can run in parallel on a multi-processor machine, as each thread can be (but doesn't necessarily have to be) scheduled onto a separate processor.

On a single processor machine, threads are timesliced and preempted (switched between) quickly (on Linux the default timeslice is 100ms) which makes them concurrent. However, they can't be run in parallel (simultaneously), since a single-core processor can only run one thing at a time.

Coroutines and/or generators can be used to implement cooperative functions. Instead of being run on kernel threads and scheduled by the operating system, they run in a single thread until they yield or finish, yielding to other functions as determined by the programmer. Languages with generators, such as Python and ECMAScript 6, can be used to build coroutines. Async/await (seen in C#, Python, ECMAScript 2017, Rust) is an abstraction built on top of generator functions that yield futures/promises.
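To make the "run until they yield" idea concrete, here is a minimal sketch in Python that builds cooperative tasks out of plain generators. The names `task` and `run_round_robin` are made up for this illustration; they are not part of any library.

```python
from collections import deque

def task(name, steps, log):
    """A generator used as a coroutine: do one unit of work, then yield control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # explicit switch point chosen by the programmer

def run_round_robin(tasks):
    """Resume each task in turn until every task has finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)      # run the task up to its next yield
        except StopIteration:
            continue           # the task finished, so drop it
        ready.append(current)  # not finished yet, so requeue it

log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Everything runs on one thread, and a task only loses control at its own `yield`, never in between.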

In some contexts, coroutines may refer to stackful functions while generators may refer to stackless functions.

Fibers, lightweight threads, and green threads are other names for coroutines or coroutine-like things. They may sometimes look (typically on purpose) more like operating system threads in the programming language, but they do not run in parallel like real threads and work instead like coroutines. (There may be more specific technical particularities or differences among these concepts depending on the language or implementation.)

For example, Java had "green threads"; these were threads that were scheduled by the Java virtual machine (JVM) instead of natively on the underlying operating system's kernel threads. These did not run in parallel or take advantage of multiple processors/cores--since that would require a native thread! Since they were not scheduled by the OS, they were more like coroutines than kernel threads. Green threads are what Java used until native threads were introduced into Java 1.2.

Threads consume resources. In the JVM, each thread has its own stack, typically 1MB in size. 64k is the least amount of stack space allowed per thread in the JVM. The thread stack size can be configured on the command line for the JVM. Threads are not free: each one needs its own stack and thread-local storage (if any), and there is the cost of thread scheduling, context switching, and CPU cache invalidation. This is part of the reason why coroutines have become popular for performance-critical, highly concurrent applications.
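The per-thread stack is configurable in other runtimes too. As a rough sketch (this is Python, not the JVM), the standard `threading.stack_size()` call requests a stack size for threads created afterwards. The 256 KiB figure is an arbitrary value picked for the example, and some platforms may reject a given size.

```python
import threading

requested = 256 * 1024  # 256 KiB, an arbitrary value chosen for this example

# Request a stack size for threads created from now on.
# The call returns the previous setting (0 means "platform default").
previous = threading.stack_size(requested)

observed = []

def report():
    # Record the stack size setting in effect for newly created threads.
    observed.append(threading.stack_size())

t = threading.Thread(target=report)
t.start()
t.join()

# Restore the earlier setting so the rest of the program is unaffected.
threading.stack_size(previous)
print(observed)
```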

Mac OS will only allow a process to allocate about 2000 threads, and Linux allocates an 8MB stack per thread by default and will only allow as many threads as will fit in physical RAM.

Hence, threads are the heaviest weight (in terms of memory usage and context-switching time), then coroutines, and finally generators are the lightest weight.


Coroutines are a form of sequential processing: only one is executing at any given time (just like subroutines AKA procedures AKA functions -- they just pass the baton among each other more fluidly).

Threads are (at least conceptually) a form of concurrent processing: multiple threads may be executing at any given time. (Traditionally, on single-CPU, single-core machines, that concurrency was simulated with some help from the OS -- nowadays, since so many machines are multi-CPU and/or multi-core, threads will de facto be executing simultaneously, not just "conceptually").


About 7 years late, but the answers here are missing some context on co-routines vs threads. Why are coroutines receiving so much attention lately, and when would I use them compared to threads?

First of all if coroutines run concurrently (never in parallel), why would anyone prefer them over threads?

The answer is that coroutines can provide a very high level of concurrency with very little overhead. Generally in a threaded environment you have at most 30-50 threads before the amount of overhead wasted actually scheduling these threads (by the system scheduler) significantly cuts into the amount of time the threads actually do useful work.

Ok so with threads you can have parallelism, but not too much parallelism, isn't that still better than a co-routine running in a single thread? Well not necessarily. Remember a co-routine can still do concurrency without scheduler overhead - it simply manages the context-switching itself.

For example, if you have a routine doing some work and it performs an operation you know will block for some time (e.g. a network request), with a co-routine you can immediately switch to another routine without the overhead of involving the system scheduler in this decision. Yes, you the programmer must specify when co-routines can switch.
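Here is a small sketch of that switch-on-blocking behaviour using Python's `asyncio`. The `fetch` function is hypothetical, and `asyncio.sleep` stands in for a real network request; no actual I/O is performed.

```python
import asyncio

async def fetch(name, delay, log):
    """Pretend to make a network request that takes `delay` seconds."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)  # switch point: other coroutines run while we wait
    log.append(f"{name} done")
    return name

async def main(log):
    # Both "requests" are in flight at once on a single thread.
    return await asyncio.gather(fetch("a", 0.2, log), fetch("b", 0.1, log))

log = []
results = asyncio.run(main(log))
print(results)  # ['a', 'b'] (gather keeps argument order)
print(log)      # ['a start', 'b start', 'b done', 'a done']
```

The total wait should be close to the longest delay rather than the sum of the two, because the second routine starts as soon as the first one reaches its `await`.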

With a lot of routines doing very small bits of work and voluntarily switching between each other, you've reached a level of efficiency no scheduler could ever hope to achieve. You can now have thousands of coroutines working together as opposed to tens of threads.
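As a rough illustration of the scale, this sketch starts ten thousand coroutines on one thread. The `worker` function and the count are arbitrary choices for the example.

```python
import asyncio

async def worker(i):
    """Do a tiny bit of work, voluntarily handing control back once."""
    await asyncio.sleep(0)  # yield to the event loop without actually waiting
    return i * 2

async def main(n):
    # Create n coroutines and run them all on the same thread.
    results = await asyncio.gather(*(worker(i) for i in range(n)))
    return sum(results)

total = asyncio.run(main(10_000))
print(total)  # 99990000
```

Starting the same number of OS threads would be far more expensive, and may not be allowed at all on some systems.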

Because your routines now switch between each other at pre-determined points, you can now also avoid locking on shared data structures (because you would never tell your code to switch to another coroutine in the middle of a critical section).
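A minimal sketch of the no-lock point, assuming all coroutines run on a single event loop in a single thread. The shared `counter` dict and the `increment` function are made up for the example.

```python
import asyncio

counter = {"value": 0}  # shared state, deliberately not protected by a lock

async def increment(times):
    for _ in range(times):
        # Critical section: a read-modify-write with no await inside it,
        # so no other coroutine can run between the read and the write.
        current = counter["value"]
        counter["value"] = current + 1
        # Switch point, placed outside the critical section.
        await asyncio.sleep(0)

async def main():
    # Ten coroutines, each incrementing the shared counter 1000 times.
    await asyncio.gather(*(increment(1000) for _ in range(10)))

asyncio.run(main())
print(counter["value"])  # 10000
```

If an `await` were placed between the read and the write, updates could be lost, so the guarantee depends entirely on where you put the switch points.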

Another benefit is the much lower memory usage. With the threaded model, each thread needs to allocate its own stack, and so your memory usage grows linearly with the number of threads you have. With co-routines, each routine typically needs only a small amount of saved state rather than a full stack, so memory usage grows far more slowly with the number of routines.

And finally, co-routines are receiving a lot of attention because in some programming languages (such as Python, where the standard CPython build has a global interpreter lock) your threads cannot run in parallel anyway: they run concurrently just like coroutines, but without the low memory usage and the low scheduling overhead.