The need for volatile modifier in double checked locking in .NET

Volatile is unnecessary. Well, sort of**

volatile is used to create a memory barrier* between reads and writes on the variable.
lock, when used, causes memory barriers to be created around the block inside the lock, in addition to limiting access to the block to one thread.
Memory barriers make it so each thread reads the most current value of the variable (not a local value cached in some register) and that the compiler doesn't reorder statements. Using volatile is unnecessary** because you've already got a lock.

Joseph Albahari explains this stuff way better than I ever could.

And be sure to check out Jon Skeet's guide to implementing the singleton in C#


update:
*volatile causes reads of the variable to be VolatileReads and writes to be VolatileWrites, which on x86 and x64 on CLR, are implemented with a MemoryBarrier. They may be finer grained on other systems.

**my answer is only correct if you are using the CLR on x86 and x64 processors. It might be true in other memory models, like on Mono (and other implementations), Itanium64 and future hardware. This is what Jon is referring to in his article in the "gotchas" for double checked locking.

Doing one of {marking the variable as volatile, reading it with Thread.VolatileRead, or inserting a call to Thread.MemoryBarrier} might be necessary for the code to work properly in a weak memory model situation.

From what I understand, on the CLR (even on IA64), writes are never reordered (writes always have release semantics). However, on IA64, reads may be reordered to come before writes, unless they are marked volatile. Unfortuantely, I do not have access to IA64 hardware to play with, so anything I say about it would be speculation.

i've also found these articles helpful:
http://www.codeproject.com/KB/tips/MemoryBarrier.aspx
vance morrison's article (everything links to this, it talks about double checked locking)
chris brumme's article (everything links to this)
Joe Duffy: Broken Variants of Double Checked Locking

luis abreu's series on multithreading give a nice overview of the concepts too
http://msmvps.com/blogs/luisabreu/archive/2009/06/29/multithreading-load-and-store-reordering.aspx
http://msmvps.com/blogs/luisabreu/archive/2009/07/03/multithreading-introducing-memory-fences.aspx


I don't think anybody has actually answered the question, so I'll give it a try.

The volatile and the first if (instance == null) are not "necessary". The lock will make this code thread-safe.

So the question is: why would you add the first if (instance == null)?

The reason is presumably to avoid executing the locked section of code unnecessarily. While you are executing the code inside the lock, any other thread that tries to also execute that code is blocked, which will slow your program down if you try to access the singleton frequently from many threads. Depending on the language/platform, there could also be overheads from the lock itself that you wish to avoid.

So the first null check is added as a really quick way to see if you need the lock. If you don't need to create the singleton, you can avoid the lock entirely.

But you can't check if the reference is null without locking it in some way, because due to processor caching, another thread could change it and you would read a "stale" value that would lead you to enter the lock unnecessarily. But you're trying to avoid a lock!

So you make the singleton volatile to ensure that you read the latest value, without needing to use a lock.

You still need the inner lock because volatile only protects you during a single access to the variable - you can't test-and-set it safely without using a lock.

Now, is this actually useful?

Well I would say "in most cases, no".

If Singleton.Instance could cause inefficiency due to the locks, then why are you calling it so frequently that this would be a significant problem? The whole point of a singleton is that there is only one, so your code can read and cache the singleton reference once.

The only case I can think of where this caching wouldn't be possible would be when you have a large number of threads (e.g. a server using a new thread to process every request could be creating millions of very short-running threads, each of which would have to call Singleton.Instance once).

So I suspect that double checked locking is a mechanism that has a real place in very specific performance-critical cases, and then everybody has clambered on the "this is the proper way to do it" bandwagon without actually thinking what it does and whether it will actually be necessary in the case they are using it for.


You should use volatile with the double check lock pattern.

Most people point to this article as proof you do not need volatile: https://msdn.microsoft.com/en-us/magazine/cc163715.aspx#S10

But they fail to read to the end: "A Final Word of Warning - I am only guessing at the x86 memory model from observed behavior on existing processors. Thus low-lock techniques are also fragile because hardware and compilers can get more aggressive over time. Here are some strategies to minimize the impact of this fragility on your code. First, whenever possible, avoid low-lock techniques. (...) Finally, assume the weakest memory model possible, using volatile declarations instead of relying on implicit guarantees."

If you need more convincing then read this article on the ECMA spec will be used for other platforms: msdn.microsoft.com/en-us/magazine/jj863136.aspx

If you need further convincing read this newer article that optimizations may be put in that prevent it from working without volatile: msdn.microsoft.com/en-us/magazine/jj883956.aspx

In summary it "might" work for you without volatile for the moment, but don't chance it write proper code and either use volatile or the volatileread/write methods. Articles that suggest to do otherwise are sometimes leaving out some of the possible risks of JIT/compiler optimizations that could impact your code, as well us future optimizations that may happen that could break your code. Also as mentioned assumptions in the last article previous assumptions of working without volatile already may not hold on ARM.


There is a way to implement it without volatile field. I'll explain it...

I think that it is memory access reordering inside the lock that is dangerous, such that you can get a not completelly initialized instance outside of the lock. To avoid this I do this:

public sealed class Singleton
{
   private static Singleton instance;
   private static object syncRoot = new Object();

   private Singleton() {}

   public static Singleton Instance
   {
      get 
      {
         // very fast test, without implicit memory barriers or locks
         if (instance == null)
         {
            lock (syncRoot)
            {
               if (instance == null)
               {
                    var temp = new Singleton();

                    // ensures that the instance is well initialized,
                    // and only then, it assigns the static variable.
                    System.Threading.Thread.MemoryBarrier();
                    instance = temp;
               }
            }
         }

         return instance;
      }
   }
}

Understanding the code

Imagine that there are some initialization code inside the constructor of the Singleton class. If these instructions are reordered after the field is set with the address of the new object, then you have an incomplete instance... imagine that the class has this code:

private int _value;
public int Value { get { return this._value; } }

private Singleton()
{
    this._value = 1;
}

Now imagine a call to the constructor using the new operator:

instance = new Singleton();

This can be expanded to these operations:

ptr = allocate memory for Singleton;
set ptr._value to 1;
set Singleton.instance to ptr;

What if I reorder these instructions like this:

ptr = allocate memory for Singleton;
set Singleton.instance to ptr;
set ptr._value to 1;

Does it make a difference? NO if you think of a single thread. YES if you think of multiple threads... what if the thread is interruped just after set instance to ptr:

ptr = allocate memory for Singleton;
set Singleton.instance to ptr;
-- thread interruped here, this can happen inside a lock --
set ptr._value to 1; -- Singleton.instance is not completelly initialized

That is what the memory barrier avoids, by not allowing memory access reordering:

ptr = allocate memory for Singleton;
set temp to ptr; // temp is a local variable (that is important)
set ptr._value to 1;
-- memory barrier... cannot reorder writes after this point, or reads before it --
-- Singleton.instance is still null --
set Singleton.instance to temp;

Happy coding!