Why does this multi-threaded code print 6 some of the time?
I think I have figured out the sequence of events leading to this issue:
Thread 1 enters if (_state == 3)
Context switch
Thread 2 enters if (_state == 3)
Thread 2 increments state (state = 4
)
Context switch
Thread 1 reads _state
as 4
Context switch
Thread 2 sets _state = 3
Thread 2 enters if (_state == 3)
Context switch
Thread 1 executes _state = 4 + 1
Context switch
Thread 2 reads _state
as 5
Thread 2 executes _state = 5 + 1
;
This is a typical race condition. EDIT: In fact, there are multiple race conditions.
It can happen at any time where _state
is 3 and both threads reach just past the if
statement, either concurrently through context switching in a single core, or simultaneously in parallel in multiple cores.
This is because the ++
operator first reads _state
and then increments it. It's possible that one got hold up enough time after the first if
statement that it'll read 5 or even 6.
EDIT: If you'd generalize this example for N threads, you may observe a number as high as 3 + N+1.
This can be right when the threads start to run, or when one has just set _state
to 3.
To avoid this, use a lock around the if
statement, or use Interlocked
to access _state
, such as if (System.Threading.Interlocked.CompareAndExchange(ref _state, 3, 4) == 3)
and System.Threading.Interlocked.Exchange(ref _state, 3)
.
If you want to keep the race condition, you should declare _state
as volatile
, otherwise you risk each thread seeing _state
locally without updates from other threads.
In alternative, you may use System.Threading.Volatile.Read
and System.Threading.Volatile.Write
, in case you switch your implementation to have _state
as a variable and Tr
as a closure that captures that variable, as local variables can't be (and won't be able to be) declared volatile
. In this case, even initialization must be done with a volatile write.
EDIT: Perhaps the race conditions are more apparent if we change the code slightly by expanding every read:
// Without some sort of memory barrier (volatile, lock, Interlocked.*),
// a thread is allowed to see _state as if other threads hadn't touched it
private static volatile int _state = 3;
// ...
for (int i = 0; i < 10000000; i++)
{
int currentState;
currentState = _state;
if (currentState == 3)
{
// RACE CONDITION: re-read the variable
currentState = _state;
currentState = currentState + 1:
// RACE CONDITION: non-atomic read-modify-write
_state = currentState;
currentState = _state;
if (currentState != 4)
{
// RACE CONDITION: re-read the variable
currentState = _state;
Console.Write(currentState);
}
_state = 3;
}
}
I added comments in places where _state
may be different than assumed by previous variable read statements.
Here's a long diagram, which shows it's even possible to print 6 twice in a row, once in each thread, like the image that the op posted. Remember, threads may not run in-synch, usually due to preemptive context-switching, cache stalls, or core speed differences (due to power saving or temporary turbo speed):
This one is similar to the original, but it uses the Volatile
class, where state
is now a variable captured by a closure. The amount and order of volatile accesses becomes obvious:
static void Main(string[] args)
{
int state = 3;
ThreadStart tr = () =>
{
for (int i = 0; i < 10000000; i++)
{
if (Volatile.Read(ref state) == 3)
{
Volatile.Write(ref state, Volatile.Read(state) + 1);
if (Volatile.Read(ref state) != 4)
{
Console.Write(Volatile.Read(ref state));
}
Volatile.Write(ref state, 3);
}
}
};
Thread firstThread = new Thread(tr);
Thread secondThread = new Thread(tr);
firstThread.Start();
secondThread.Start();
firstThread.Join();
secondThread.Join();
Console.ReadLine();
}
Some thread-safe approaches:
private static object _lockObject;
// ...
// Do not allow concurrency, blocking
for (int i = 0; i < 10000000; i++)
{
lock (_lockObject)
{
// original code
}
}
// Do not allow concurrency, non-blocking
for (int i = 0; i < 10000000; i++)
{
bool lockTaken = false;
try
{
Monitor.TryEnter(_lockObject, ref lockTaken);
if (lockTaken)
{
// original code
}
}
finally
{
if (lockTaken) Monitor.Exit(_lockObject);
}
}
// Do not allow concurrency, non-blocking
for (int i = 0; i < 10000000; i++)
{
// Only one thread at a time will succeed in exchanging the value
try
{
int previousState = Interlocked.CompareExchange(ref _state, 4, 3);
if (previousState == 3)
{
// Allow race condition on purpose (for no reason)
int currentState = Interlocked.CompareExchange(ref _state, 0, 0);
if (currentState != 4)
{
// This branch is never taken
Console.Write(currentState);
}
}
}
finally
{
Interlocked.CompareExchange(ref _state, 3, 4);
}
}
// Allow concurrency
for (int i = 0; i < 10000000; i++)
{
// All threads increment the value
int currentState = Interlocked.Increment(ref _state);
if (currentState == 4)
{
// But still, only one thread at a time enters this branch
// Allow race condition on purpose (it may actually happen here)
currentState = Interlocked.CompareExchange(ref _state, 0, 0);
if (currentState != 4)
{
// This branch might be taken with a maximum value of 3 + N
Console.Write(currentState);
}
}
Interlocked.Decrement(ref _state);
}
This one is a bit different, it takes the last known value of _state
after the increment to perform something:
// Allow concurrency
for (int i = 0; i < 10000000; i++)
{
// All threads increment the value
int currentState = Interlocked.Increment(ref _state);
if (currentState != 4)
{
// Only the thread that incremented 3 will not take the branch
// This can happen indefinitely after the first increment for N > 1
// This branch might be taken with a maximum value of 3 + N
Console.Write(currentState);
}
Interlocked.Decrement(ref _state);
}
Note that the Interlocked.Increment
/Interlocked.Decrement
examples are not safe, unlike the lock
/Monitor
and Interlocked.CompareExchange
examples, as there's no reliable way of knowing if the increment was successful or not.
One common approach is to increment, then follow with a try
/finally
where you decrement in the finally
block. However, an asynchronous exception might be thrown (e.g. ThreadAbortException
)
Asynchronous exceptions can be thrown in unexpected locations, possibly every machine instruction: ThreadAbortException, StackOverflowException, and OutOfMemoryException.
Another approach is to initialize currentState
to something below 3 and conditionally decrement in a finally
block. But again, in between Interlocked.Increment
returning and currentState
being assigned to the result, an asynchronous exception might occur, so currentState
could still have the initial value even though the Interlocked.Increment
succeeded.