c++ visual-studio-2017 openmp

Performance and profiling of OpenMP C++ code in VS2017


I have a performance-critical piece of C++ code, built with Visual Studio 2017, that I've been profiling to look for potential bottlenecks. At a high level, the profiler shows about 80% CPU usage across my eight cores while this code executes. With all the kernel symbols loaded, the profiler reports that the busiest function is NtYieldExecution, at 52% usage. My guess is that this 52% is not correct (possibly it is 52% of one thread), but even then I'd be keen to know what's going on under the hood. I also have my own thread pool code, which led to 100% CPU usage on other code, so I'm wondering whether to move this code to an alternative multi-threading model. OpenMP is very convenient, but is it inefficient in Visual Studio 2017? More importantly, is it possible to isolate and remove any such inefficiencies?


Solution

  • The problem, as it turned out, was that part of the multi-threaded code was inadvertently writing to a variable declared outside the scope of the OpenMP section, which in turn led to the automatic insertion of a lock, as seen in PartialBarrierN::Block. I resolved this by switching to a more local variable, which resulted in a significant speed-up and 100% CPU usage.
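
The original code isn't shown, so the following is only a minimal sketch of the kind of change described above, not the actual fix. The function name `sum_of_squares` and the variables `total` and `local` are hypothetical. Each thread accumulates into a variable declared inside the parallel region and touches the shared variable only once, at the end. The `critical` section is my own choice for that final write; a `reduction(+:total)` clause would be another way to do it.

```cpp
#include <vector>

// Hypothetical kernel (not the code from the question): sums the squares of the input.
double sum_of_squares(const std::vector<double>& data)
{
    double total = 0.0;  // shared: declared outside the parallel region
    // A signed loop index is used because MSVC's OpenMP 2.0 support requires one.
    const int n = static_cast<int>(data.size());

    #pragma omp parallel
    {
        double local = 0.0;  // private: declared inside the region, one copy per thread

        #pragma omp for nowait
        for (int i = 0; i < n; ++i)
            local += data[i] * data[i];  // hot loop writes only to the thread-local variable

        #pragma omp critical
        total += local;  // a single synchronized write per thread, outside the hot loop
    }
    return total;
}
```

Calling `sum_of_squares(std::vector<double>(1000, 2.0))` returns 4000.0. Without OpenMP enabled (/openmp in Visual Studio) the pragmas are ignored and the function simply runs serially with the same result.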