c++, openmp, reduction

OpenMP: nowait and reduction clauses on the same pragma


I am studying OpenMP, and came across the following example:

#pragma omp parallel shared(n,a,b,c,d,sum) private(i)
{
    #pragma omp for nowait
    for (i=0; i<n; i++)
        a[i] += b[i];

    #pragma omp for nowait
    for (i=0; i<n; i++)
        c[i] += d[i];
    #pragma omp barrier

    #pragma omp for nowait reduction(+:sum)
    for (i=0; i<n; i++)
        sum += a[i] + c[i];
} /*-- End of parallel region --*/

In the last for loop, there is both a nowait and a reduction clause. Is this correct? Doesn't the reduction clause need to be synchronized?


Solution

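  • The nowait clauses on the second and last loops are somewhat redundant. On the last loop, nowait only drops that loop's own implicit barrier; the parallel region still ends with an implicit barrier, and the reduced value of sum is guaranteed to be complete only after that barrier, so combining the two clauses is correct here. The OpenMP spec itself shows nowait just before the end of a region, so this one can arguably stay in.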

    However, the nowait on the second loop and the explicit barrier right after it cancel each other out, so both can be dropped: the loop's implicit barrier does the same job.

    Lastly, about the shared and private clauses: in your code, shared has no effect because variables declared outside the parallel region are shared by default, and private simply shouldn’t be used at all. If you need a thread-private variable, just declare it inside the parallel region. In particular, declare loop variables inside the loop, not before it.

    To make shared useful, you need to tell OpenMP that it shouldn’t share anything by default. You should do this to avoid bugs due to accidentally shared variables. This is done by specifying default(none). This leaves us with:

    #pragma omp parallel default(none) shared(n, a, b, c, d, sum)
    {
        #pragma omp for nowait
        for (int i = 0; i < n; ++i)
            a[i] += b[i];
    
        #pragma omp for
        for (int i = 0; i < n; ++i)
            c[i] += d[i];
    
        #pragma omp for nowait reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += a[i] + c[i];
    } // End of parallel region
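
To make the point about nowait and reduction concrete, here is a minimal sketch that wraps the cleaned-up region in a function. The function name update_and_sum and its pointer-based signature are assumptions for illustration and do not appear in the original example. The key point is that sum is read only after the parallel region has closed, that is, after its implicit barrier.

```cpp
#include <cassert>

// Hypothetical helper (name and signature are illustrative assumptions):
// adds b into a and d into c element-wise, then returns the sum of a[i] + c[i].
double update_and_sum(int n, double* a, const double* b, double* c, const double* d)
{
    assert(n >= 0);
    double sum = 0.0;

    #pragma omp parallel default(none) shared(n, a, b, c, d, sum)
    {
        // No barrier needed here: the next loop touches only c and d.
        #pragma omp for nowait
        for (int i = 0; i < n; ++i)
            a[i] += b[i];

        // The implicit barrier at the end of this loop guarantees that
        // every thread has finished updating both a and c.
        #pragma omp for
        for (int i = 0; i < n; ++i)
            c[i] += d[i];

        // nowait removes only this loop's own barrier. Reading sum at this
        // point, inside the region, would be a race.
        #pragma omp for nowait reduction(+:sum)
        for (int i = 0; i < n; ++i)
            sum += a[i] + c[i];
    } // Implicit barrier: the reduction into sum is complete after this line.

    return sum; // Safe: read after the end of the parallel region.
}
```

With GCC or Clang this would be compiled with -fopenmp; without that flag the pragmas are ignored and the function runs serially with the same result. For example, calling it with n = 4 and arrays filled with 1, 2, 3 and 4 for a, b, c and d respectively returns 40, since each a[i] becomes 3 and each c[i] becomes 7.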