Search code examples
c++multithreadingsingletondouble-checked-locking

Is implementation of double checked singleton thread-safe?


I know that the common implementation of thread-safe singleton looks like this:

Singleton* Singleton::instance() {
   if (pInstance == 0) {
      Lock lock;
      if (pInstance == 0) {
         Singleton* temp = new Singleton; // initialize to temp
         pInstance = temp; // assign temp to pInstance
      }
   }
   return pInstance;
}

But why they say that it is a thread-safe implementation?
For example, the first thread can pass both tests on pInstance == 0, create new Singleton and assign it to the temp pointer and then start assignment pInstance = temp (as far as I know, the pointer assignment operation is not atomic).
At the same time the second thread tests the first pInstance == 0, where pInstance is assigned only half. It's not nullptr but not a valid pointer too, which then returned from the function. Can such a situation happen? I didn't find the answer anywhere and seems that it is a quite correct implementation and I don't understand anything


Solution

  • It's not safe by C++ concurrency rules, since the first read of pInstance is not protected by a lock or something similar and thus does not synchronise correctly with the write (which is protected). There is therefore a data race and thus Undefined Behaviour. One of the possible results of this UB is precisely what you've identified: the first check reading a garbage value of pInstance which is just being written by a different thread.

    The common explanation is that it saves on acquiring the lock (a potentially time-expensive operation) in the more common case (pInstance already valid). However, it's not safe.

    Since C++11 and beyond guarantees initialisation of function-scope static variables happens only once and is thread-safe, the best way to create a singleton in C++ is to have a static local in the function:

    Singleton& Singleton::instance() {
       static Singleton s;
       return s;
    }
    

    Note that there's no need for either dynamic allocation or a pointer return type.


    As Voo mentioned in comments, the above assumes pInstance is a raw pointer. If it was std::atomic<Singleton*> instead, the code would work just fine as intended. Of course, it's then a question whether an atomic read is that much slower than obtaining the lock, which should be answered via profiling. Still, it would be a rather pointless exercise, since the static local variables is better in all but very obscure cases.