c++ multithreading c++11 thread-local-storage

C++ Thread Local Singleton intermittent failure


I've tried to implement a very basic Thread Local Singleton class in C++ - it's a template class that other classes then inherit from. The problem is that it almost always works, but every now and again (say, 1 run in 15), it will fail with an error along the lines of:

*** glibc detected *** ./myExe: free(): invalid next size (fast): 0x00002b61a40008c0 ***

Please forgive the rather contrived example below, but it serves to demonstrate the problem.

#include <thread>
#include <atomic>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

using namespace std;

template<class T>
class ThreadLocalSingleton
{
public:
    /// Return a reference to an instance of the object
    static T& instance();

    typedef unique_ptr<T> UPtr;

protected:
    ThreadLocalSingleton() {}
    ThreadLocalSingleton(ThreadLocalSingleton const&);
    void operator=(ThreadLocalSingleton const&);
};

template<class T>
T& ThreadLocalSingleton<T>::instance()
{
    thread_local T m_instance;
    return m_instance;
}

// Create two atomic variables to keep track of the number of times the
// TLS class is created and accessed.
atomic<size_t> creationCount(0);
atomic<size_t> accessCount(0);

// Very simple class which derives from TLS
class MyClass : public ThreadLocalSingleton<MyClass>
{
    friend class ThreadLocalSingleton<MyClass>;
public:
    MyClass()
    {
        ++creationCount;
    }

    string getType() const
    {
        ++accessCount;
        return "MyClass";
    }
};

int main(int,char**)
{
    vector<thread> threads;
    vector<string> results;

    threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });
    threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });
    threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });
    threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });

    for (auto& t : threads)
    {
        t.join();
    }

    // Expecting 4 creations and 8 accesses.
    cout << "CreationCount: " << creationCount << " AccessCount: " << accessCount << endl;
}

I can replicate this on coliru, using the build command: g++ -std=c++11 -O2 -Wall -pedantic -pthread main.cpp && ./a.out

Many thanks!


Solution

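  • Thanks to both molbdnilo and Damon, who quickly pointed out the obvious: vector::emplace_back isn't thread safe. All four threads were calling it on the same results vector with no synchronization, which is a data race (undefined behavior) and can corrupt the heap, hence the intermittent free() error. The thread-local singleton itself is not the culprit. I've replaced the main() function with the following, which serializes the writes behind a mutex and seems to be more reliable.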

    int main(int,char**)
    {
        vector<thread> threads;
        vector<string> results;
    
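        // Serialize every write to results behind one mutex.
        // Needs #include <mutex> in addition to the headers in the question.
        auto addToResult = [&results](const string& val)
        {
            static mutex m_mutex;
            unique_lock<mutex> lock(m_mutex);
            results.emplace_back(val);
        };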
    
        threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
        threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
        threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
        threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
    
        for (auto& t : threads)
        {
            t.join();
        }
    
        // Expecting 4 creations and 8 accesses.
        cout << "CreationCount: " << creationCount << " AccessCount: " << accessCount << endl;
    }
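
    An alternative that avoids the lock altogether, sketched under the assumption that the number of threads is known before they start: size the results vector up front and let each thread write only to its own element. Different elements of the same standard container (other than vector<bool>) may be modified concurrently, so no mutex is needed as long as the vector is never resized while the threads run. collectResults is a hypothetical helper name, not something from the original code, and the string literal stands in for MyClass::instance().getType().

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original post): starts threadCount
// threads and gives each one its own pre-allocated slot to write into.
std::vector<std::string> collectResults(std::size_t threadCount)
{
    // Size the vector before any thread starts so it never reallocates.
    std::vector<std::string> results(threadCount);
    std::vector<std::thread> threads;
    threads.reserve(threadCount);

    for (std::size_t i = 0; i < threadCount; ++i)
    {
        // i is captured by value, so each thread touches only results[i].
        threads.emplace_back([&results, i]() {
            // Stand-in for MyClass::instance().getType() (assumption).
            results[i] = "MyClass";
        });
    }

    // Only the calling thread ever modifies the threads vector itself.
    for (auto& t : threads)
    {
        t.join();
    }
    return results;
}
```

    Calling collectResults(4) returns four strings, one per thread, stored in thread-index order rather than completion order. The trade-off against the mutex version is that the result count has to be fixed in advance.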
    

    Thanks!