Non-Toy Software Transactional Memory for C or Java

Production-quality STM-Libraries are not intended as a teaching tool, not even as "best practice". What is worth learning for any college/university-course is maybe 1% of the code; the remaining 99% is nitty-gritty platform-dependent intrinsic corner-cases. The 1% that is interesting is not highlighted in any way so you have no way of finding it.

What I recommend for a college/university-course (no matter if introductory or advanced) is to implement STM-buildingblocks yourself (and only for 1 platform).

Start by introducing the problems: concurrency, cache...

Then introduce the atomic helpers we have: cas/cmpxchg, fence.

Then build examples together with your students, first easy, then harder and more complex.


There are three good ways to do STM today.

The first way is to use gcc and do TM in C or C++. As of gcc 4.7, transactional memory is supported via the -fgnu-tm flag. The gcc maintainers have done a lot of work, and as of the 4.9 (trunk) branch, you can even use hardware TM (e.g., Intel Haswell TSX). There is a draft specification for the interface to the TM at http://justingottschlich.com/tm-specification-for-c-v-1-1/, which is not too painful. You can also find use cases of gcc's TM from the TM community (see, for example, the application track papers from transact 2014: http://transact2014.cse.lehigh.edu).

The implementation itself is a bit complex, but that's what it takes to be correct. There's a lot of literature on the things that can go wrong, especially in a type-unsafe language like C or C++. GCC gets all of these things right. Really.

The second way is to use Java. You can either use DeuceSTM, which is very easy to extend (type safety makes TM implementation much easier!), or use Scala's Akka library for STM. I prefer Deuce, because it's easier to extend and easier to use (you just annotate a method as @Atomic, and Deuce's java agents do the rest).

The third way is to use Scala. I've not done much in this space, but researchers seem to love Akka. If you're affiliated with a parallel/distributed class, you might even be using Scala already.


Start by introducing the problems: concurrency, cache...

Leading on from eznme, some good problems that I covered while at University for concurrency.

  • Dining philosophers problem

    In computer science, the dining philosophers problem is an example problem often used in concurrent algorithm design to illustrate synchronization issues and techniques for resolving them.

    dining phil
    (source: wikimedia.org)

Using the same implementation from here, by Je Magee and Je Kramer, and solving the problem using monitors.

Most shared memory applications are more efficient with Integers than Strings (due to AtomicInteger class for Java). So the best way to demonstrate shared memory in my opinion is to get the students to write an application that uses a threadpool to calculate prime numbers, or to calculate some integral.

Or a good example of threads and shared memory is the Producer-consumer problem.

The producer-consumer problem (also known as the bounded-buffer problem) is a classic example of a multi-process synchronization problem.

producer
(source: csusb.edu)

Implementation found here, there is also an implementation from Massey from the professor in Software Eng Jenz Dietrich.

For distributed algorithms MapReduce and Hadoop are highly documented distributed data structures. And for Distributed Programming Libraries look into MPI (Message Passing Interface) and OpenMP (or Pragma for C++). There is also implementations of Dijkstra shortest path algorithm in parallel too.