Search code examples
c++performanceboostc++11opencover

Is a thread-safe queue a good approach?


I am looking for a way to optimize the following code, for an open source project that I develop, or make it more performant by moving the heavy work to another thread.

void ProfilerCommunication::AddVisitPoint(ULONG uniqueId)
{
    CScopedLock<CMutex> lock(m_mutexResults);
    m_pVisitPoints->points[m_pVisitPoints->count].UniqueId = uniqueId;
    if (++m_pVisitPoints->count == VP_BUFFER_SIZE)
    {
        SendVisitPoints();
        m_pVisitPoints->count=0;
    } 
}

The above code is used by the OpenCover profiler (an open source code coverage tool for .NET written in C++) when each visit point is called. The mutex is used to protect some shared memory (a 64K block shared between several processes 32/64 bit and C++/C#) when full it signals the host process. Obviously this is quite heavy for each instrumentation point and I'd like to make the impact lighter.

I am thinking of using a queue which is pushed to by the above method and a thread to pop the data and populate the shared memory.

Q. Is there a thread-safe queue in C++ (Windows STL) that I can use - or a lock-less queue as I wouldn't want to replace one issue with another? Do people consider my approach sensible?


EDIT 1: I have just found concurrent_queue.h in the include folder - could this be my answer...?


Solution

  • Okay I'll add my own answer - concurrent_queue works very well

    using the details described in this MSDN article I implemented concurrent queue (and tasks and my first C++ lambda expression :) ) I didn't spend long thinking though as it is a spike.

    inline void AddVisitPoint(ULONG uniqueId) { m_queue.push(uniqueId); }
    
    ...
    // somewhere else in code
    
    m_tasks.run([this]
    {
        ULONG id;
        while(true)
        {
             while (!m_queue.try_pop(id)) 
                Concurrency::Context::Yield();
    
            if (id==0) break; // 0 is an unused number so is used to close the thread/task
            CScopedLock<CMutex> lock(m_mutexResults);
            m_pVisitPoints->points[m_pVisitPoints->count].UniqueId = id;
            if (++m_pVisitPoints->count == VP_BUFFER_SIZE)
            {
                SendVisitPoints();
                m_pVisitPoints->count=0;
            }
        }
    });
    

    Results:

    • Application without instrumentation = 9.3
    • Application with old instrumentation handler = 38.6
    • Application with new instrumentation handler = 16.2