I am looking for a way to optimize the following code, for an open source project that I develop, or make it more performant by moving the heavy work to another thread.
void ProfilerCommunication::AddVisitPoint(ULONG uniqueId)
{
CScopedLock<CMutex> lock(m_mutexResults);
m_pVisitPoints->points[m_pVisitPoints->count].UniqueId = uniqueId;
if (++m_pVisitPoints->count == VP_BUFFER_SIZE)
{
SendVisitPoints();
m_pVisitPoints->count=0;
}
}
The above code is used by the OpenCover profiler (an open source code coverage tool for .NET written in C++) when each visit point is called. The mutex is used to protect some shared memory (a 64K block shared between several processes 32/64 bit and C++/C#) when full it signals the host process. Obviously this is quite heavy for each instrumentation point and I'd like to make the impact lighter.
I am thinking of using a queue which is pushed to by the above method and a thread to pop the data and populate the shared memory.
Q. Is there a thread-safe queue in C++ (Windows STL) that I can use - or a lock-less queue as I wouldn't want to replace one issue with another? Do people consider my approach sensible?
EDIT 1: I have just found concurrent_queue.h in the include folder - could this be my answer...?
Okay I'll add my own answer - concurrent_queue works very well
using the details described in this MSDN article I implemented concurrent queue (and tasks and my first C++ lambda expression :) ) I didn't spend long thinking though as it is a spike.
inline void AddVisitPoint(ULONG uniqueId) { m_queue.push(uniqueId); }
...
// somewhere else in code
m_tasks.run([this]
{
ULONG id;
while(true)
{
while (!m_queue.try_pop(id))
Concurrency::Context::Yield();
if (id==0) break; // 0 is an unused number so is used to close the thread/task
CScopedLock<CMutex> lock(m_mutexResults);
m_pVisitPoints->points[m_pVisitPoints->count].UniqueId = id;
if (++m_pVisitPoints->count == VP_BUFFER_SIZE)
{
SendVisitPoints();
m_pVisitPoints->count=0;
}
}
});
Results: