Search code examples
c++parallel-processingopenmpcritical-section

How to parallelize updating sums with OpenMP


The following loop iterates over all edges of a graph, determines if the end nodes belong to the same group, and then adds the edge weight to the total edge weight of that group.

// TODO: parallel
FORALL_EDGES_BEGIN((*G)) {
    node u = EDGE_SOURCE;
    node v = EDGE_DEST;
    cluster c = zeta[u];
    cluster d = zeta[v];
    if (c == d) {
        // TODO: critical section
        intraEdgeWeight[c] += G->weight(u, v);
    } // else ignore edge
} FORALL_EDGES_END();

I would like to parallelize it with OpenMP. I think that the code in the if statement is a critical section that might lead to a race condition and wrong results if threads are interrupted in the middle of it (correct?).

If the += operation could be made atomically, I believe that the problem is solved (correct?). However, I've looked up the atomic directive and there it states that:

Unfortunately, the atomicity setting can only be specified to simple expressions that usually can be compiled into a single processor opcode, such as increments, decrements, xors and the such. For example, it cannot include function calls, array indexing, overloaded operators, non-POD types, or multiple statements.

What should I use to parallelize this loop correctly?


Solution

  • Actually the accepted syntax for atomic update is:

    x++;

    x--;

    ++x;

    --x;

    x binop= expr;

    x = x binop expr;

    where x is a scalar l-value expression and expr is any expression, including function calls, with the only restriction being that it must be of scalar type. The compiler would take care to store the intermediate result in a temporary variable.

    Sometimes it's better to consult the standards documents instead of reading tutorials on the Internet. Observe the OpenMP 3.1 standard Expample A.22.1c:

    float work1(int i)
    {
      return 1.0 * i;
    }
    ...
    #pragma omp atomic update
    x[index[i]] += work1(i);