c++ multithreading performance openmp mutex

Difference between std::lock_guard and #pragma omp critical

Let's consider some code to safely increment a variable in a for loop with multiple threads.

To achieve this, you have to use some kind of locking mechanism when incrementing the variable. While searching for a solution, I came up with the following two solutions.

My questions are:

Are they equally good, or does one of them have drawbacks?
When to use a mutex instead of #pragma omp critical?

#include <iostream>
#include <mutex>

int main(int argc, char** argv)
{
    int someVar = 0;
    std::mutex someVar_mutex;

    #pragma omp parallel for
    for (int i = 0; i < 1000; i++)
    {
        std::lock_guard<std::mutex> lock(someVar_mutex);
        ++someVar;
    }

    std::cout << someVar << std::endl;

    return 0;
}

#include <iostream>

int main(int argc, char** argv)
{
    int someVar = 0;

    #pragma omp parallel for
    for (int i = 0; i < 1000; i++)
    {
        #pragma omp critical
        ++someVar;
    }

    std::cout << someVar << std::endl;

    return 0;
}

Solution

The critical section serves the same purpose as acquiring a lock (and will probably use a lock internally).

std::mutex is a standard C++ feature, whereas #pragma omp critical is an OpenMP directive and is not defined by the C++ standard.
Critical section names are global to the entire program (regardless of module boundaries). So if critical sections with the same name appear in multiple modules, no two of them can execute at the same time. If the name is omitted, a default name is assumed, so all unnamed critical sections exclude one another (docs).
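To illustrate the naming rule, here is a minimal sketch in which two unrelated counters each get their own named critical section. The function name count_two and the section names counter_a and counter_b are made up for this example. With distinct names, a thread updating one counter should not block a thread updating the other; with two unnamed sections they would share one lock.

```cpp
#include <utility>

// Hypothetical helper: increments two independent counters in parallel.
// Each counter is guarded by its own *named* critical section, so the two
// sections do not exclude each other.
std::pair<int, int> count_two(int n)
{
    int a = 0;
    int b = 0;

    #pragma omp parallel for
    for (int i = 0; i < n; i++)
    {
        // Only one thread at a time may be inside "counter_a".
        #pragma omp critical(counter_a)
        ++a;

        // "counter_b" is a different name, hence a different lock.
        #pragma omp critical(counter_b)
        ++b;
    }

    return {a, b};
}
```

If the code is built without OpenMP support, the pragmas are ignored and the loop simply runs serially, so the result is the same either way.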

I would prefer standard C++ unless there is a good reason to use the other (after measuring both).

Not directly related to the question, but there is another problem with this loop: the lock is acquired on every loop iteration, which degrades performance significantly (see also this answer).
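One common way to avoid locking on every iteration is to let each thread accumulate into a private variable and enter the critical section only once, when merging its partial result. The sketch below assumes that approach; it is not taken from the linked answer, and the function name count_with_local_sums is made up for illustration.

```cpp
// Hypothetical helper: counts to n with one critical section entry per
// thread instead of one per loop iteration.
int count_with_local_sums(int n)
{
    int someVar = 0;

    #pragma omp parallel
    {
        // Declared inside the parallel region, so each thread has its own copy.
        int local = 0;

        // The iterations are divided among the threads; no lock is needed
        // here because each thread only touches its own `local`.
        #pragma omp for
        for (int i = 0; i < n; i++)
        {
            ++local;
        }

        // Merge the partial result: executed once per thread.
        #pragma omp critical
        someVar += local;
    }

    return someVar;
}
```

Calling count_with_local_sums(1000) should give the same total as the original loops. For a simple sum like this one, OpenMP's reduction(+:someVar) clause on the parallel for expresses the same idea more compactly.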