Difference between subroutine, co-routine, function and thread?

Until we get a post from someone who really knows, here's my understanding of the question, FWIW.

A subroutine and a function are essentially the same thing, with one difference: A function returns some sort of value (usually via the stack or CPU register), while a subroutine does not. Whether subroutine or function, it is a block of executable code, having exactly one point of entry. A co-routine is also a block of executable code, and, just like a subroutine, it has one point of entry. However, it also has one or more points of re-entry. More on this later.

Before getting to threads, let's review: A running computer program, also known as a process, will generally have its allocation of memory organized into a code space, a heap, and a stack. The code space stores the one or more blocks of its executable code. The stack stores the parameters, automatic variables, and return addresses of subroutines, functions, and co-routines (and other things too). The heap is the wide-open memory space available to the process for whatever purposes it has. In addition to these memory spaces there are the CPU registers, each of which stores a set of bits. These bits could be an integer value, a memory address, a bunch of status flags, or whatever. Most programmers don't need to know much about them, but they're there and essential to the operation of the CPU. Probably the ones worth knowing about are the Program Counter, Stack Pointer, and Status Register, but we're not going to get into them here.

A thread is a single logical flow of execution. In a primitive computing system, there is only one thread available to a process. In modern computing systems, a process is composed of one or more threads. Each thread gets its own stack and set of CPU registers (which is usually physically impossible, so the registers are made virtual - a detail we'll skip over here). However, while each thread of a process has its own stack and registers, they all share the same heap and code space. They are also (presumably) running simultaneously, something that can truly happen on a multi-core CPU. So two or more parts of your program can run at the same time.
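
To make the "own stack, shared heap" point concrete, here is a minimal sketch in Python. The names `worker`, `shared_results`, and `lock` are my own, purely for illustration. Each thread keeps its own copy of the local variable `total`, while all of them append to the one shared list. (In the standard CPython build the threads mostly take turns rather than run truly in parallel, but the memory-sharing picture is the same.)

```python
import threading

shared_results = []        # one object on the heap, visible to every thread
lock = threading.Lock()    # guards concurrent appends to the shared list

def worker(worker_id, count):
    # 'total' is a local (automatic) variable: every thread running this
    # function gets its own independent copy in its own stack frame.
    total = 0
    for i in range(count):
        total += i
    with lock:
        shared_results.append((worker_id, total))

threads = [threading.Thread(target=worker, args=(n, 5)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()               # wait for every thread to finish

print(sorted(shared_results))   # [(0, 10), (1, 10), (2, 10)]
```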

Back to the co-routine: As mentioned before, it has one or more points of re-entry. A point of re-entry means that the co-routine can allow some other block of code outside of itself to have some execution time, and then at some future time have execution resume within its own block of code. This implies that the parameters and automatic variables of the co-routine are preserved (and restored if need be) whenever execution is yielded to an external block of code and then returns to that of the co-routine. A co-routine is something that is not directly implemented in every programming language, although it is common to many assembly languages. In any case, where the language does not provide co-routines directly, it is still possible to implement the concept by hand. There is a good article on co-routines at http://en.wikipedia.org/wiki/Coroutine.
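
As a rough illustration of a point of re-entry, here is a sketch using a Python generator, which is one language-level form of co-routine. The name `running_total` is hypothetical. The local variable `total` survives every time control leaves at the `yield` and comes back.

```python
def running_total():
    total = 0                  # automatic variable, preserved across yields
    while True:
        # Point of re-entry: execution pauses here, control goes back to
        # the caller, and the next send() resumes at exactly this spot.
        value = yield total
        total += value

acc = running_total()
next(acc)                      # run up to the first yield
a = acc.send(5)                # resumes the co-routine; total becomes 5
b = acc.send(10)               # resumes again; total becomes 15
print(a, b)                    # 5 15
```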

It seems to me there are two principal motivations for implementing a co-routine design pattern: (1) overcoming the limitations of a single-threaded process; and (2) hoping to achieve better computational performance. Motivation (1) is easy to understand when the process must attend to many things at once but is restricted to a single thread. Motivation (2) may not be as easy to understand, since it is tied to a lot of particulars about the system hardware, compiler design, and language design. I can only imagine that computational effort might be reduced by cutting back on stack manipulations, avoiding redoing initializations in a subroutine, or relieving some of the overhead of maintaining a multi-threaded process.
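
Motivation (1) can be sketched as a tiny round-robin scheduler, again with Python generators. This is only an illustration under my own assumptions, not how any particular system does it; `task` and `run_round_robin` are made-up names. Two tasks make progress in turn on a single thread because each one voluntarily yields.

```python
from collections import deque

def task(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")   # do one small piece of work
        yield                       # hand control back to the scheduler

def run_round_robin(tasks):
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)           # resume the task until its next yield
            queue.append(current)   # not finished: go to the back of the line
        except StopIteration:
            pass                    # task ran to completion; drop it

log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)   # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```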

HTH


I'd like to expand on the existing answers by adding the following.
There are four main concepts in the invocation of a piece of code:

  1. creation of a context (aka "frame", a local environment for this code to operate)
  2. resuming/invocation of a context (transferring control to that frame, aka jumping)
  3. detaching (resuming another context, similar to 2.)
  4. and destruction of a context

In subroutines, creation and resuming happen simultaneously with a "call" instruction - a stack frame gets allocated, arguments and the return address are pushed, and execution jumps to the called code. Likewise, detaching (resuming the caller) and destruction happen simultaneously with a "return" instruction - the stack frame is de-allocated, control is transferred to the caller (via the previously supplied return address), and the caller is left to scavenge the junk on the stack to pick out the return value (depending on your calling conventions).
In coroutines, these main concepts exist independently, decoupled from each other. You create a coroutine at some time, then you can transfer control to it later (so it can yield you some results, possibly multiple times), and then you can destroy it at some later point in time.
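
Here is how those four steps might look with a Python generator, as a sketch only. The `counter` function and the `events` list are hypothetical, added just to make each step visible.

```python
def counter(events):
    events.append("started")
    try:
        n = 0
        while True:
            n += 1
            yield n                    # 3. detach: control returns to the caller
    finally:
        events.append("destroyed")     # runs when the context is torn down

events = []
gen = counter(events)      # 1. create the context; none of the body has run
created_only = list(events)            # still empty at this point
first = next(gen)          # 2. resume: runs until the first yield
second = next(gen)         # 2. resume again, continuing after the yield
gen.close()                # 4. destroy the context explicitly
print(created_only, first, second, events)
# [] 1 2 ['started', 'destroyed']
```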


Focusing on coroutine vs subroutine:

Coroutines can yield and this is interesting.

Yield 'remembers' where the co-routine is, so when it is called again it continues where it left off.

For example:

  coroutine foo {
    yield 1;
    yield 2;
    yield 3;
  }
  print foo();
  print foo();
  print foo();

Prints: 1 2 3
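
For comparison, here is roughly the same thing as a runnable Python generator. One difference from the pseudocode above, as far as Python goes: calling `foo()` only creates the coroutine object, and each `next()` call resumes it.

```python
def foo():
    yield 1
    yield 2
    yield 3

gen = foo()                              # create the generator; nothing runs yet
print(next(gen), next(gen), next(gen))   # 1 2 3
```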

Note: Coroutines may also use a return, in which case they behave just like a subroutine:

  coroutine foo {
    return 1;
    return 2; // Dead code
    return 3; // Dead code
  }
  print foo();
  print foo();
  print foo();

Prints: 1 1 1
