c++ parallel-processing openmp simd

error: reduction variable is private in outer context (omp reduction)


I am confused about the data-sharing scope of the variable acc in the following two cases. In case 1 I get the following compilation error: error: reduction variable ‘acc’ is private in outer context, whereas case 2 compiles without any issues.

According to this article, variables defined outside a parallel region are shared.

Why does adding for-loop parallelism privatize acc? How can I, in this case, accumulate the result calculated in the for-loop and distribute the loop's iteration space across a thread team?

case 1

            float acc = 0.0f;
            
            #pragma omp for simd reduction(+: acc)
            for (int k = 0; k < MATRIX_SIZE; k++) {
                float mul = alpha;
                mul *=  a[i * MATRIX_SIZE + k];
                mul *=  b[j * MATRIX_SIZE + k];
                acc += mul;
            }

case 2

            float acc = 0.0f;
            
            #pragma omp simd reduction(+: acc)
            for (int k = 0; k < MATRIX_SIZE; k++) {
                float mul = alpha;
                mul *=  a[i * MATRIX_SIZE + k];
                mul *=  b[j * MATRIX_SIZE + k];
                acc += mul;
            }



Solution

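  • Your case 1 violates OpenMP semantics. The for simd construct is a worksharing construct, and the reduction variable of a worksharing construct must be shared in the parallel region that the construct binds to. Here there is an implicit parallel region (see OpenMP Language Terminology, "sequential part") that contains the definition of acc, so acc is private to that implicit parallel region. This is what the compiler complains about.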

    Your case 2 is different: the simd construct is not a worksharing construct, so its reduction clause is defined with different semantics and does not require acc to be shared in an enclosing parallel region.

    Case 1 would be correct if you wrote it this way:

    void example(void) {
        float acc = 0.0f;
    
        #pragma omp parallel for simd reduction(+: acc)
        for (int k = 0; k < MATRIX_SIZE; k++) {
            float mul = alpha;
            mul *=  a[i * MATRIX_SIZE + k];
            mul *=  b[j * MATRIX_SIZE + k];
            acc += mul;
        }
    }
    

    The acc variable is now defined outside of the parallel region that the for simd construct binds to, so it is shared in that region and the reduction is valid.
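To make the point about where acc is declared concrete, here is a minimal self-contained sketch. The function names scaled_dot_combined and scaled_dot_split, and the use of std::vector instead of the matrix rows from the question, are assumptions made for illustration only. Both variants declare acc before the parallel region is created, so acc is shared in the region that the worksharing loop binds to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): computes
// sum_k alpha * a[k] * b[k] using the combined construct.
float scaled_dot_combined(const std::vector<float>& a,
                          const std::vector<float>& b,
                          float alpha) {
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());

    // Declared outside the parallel region created by the pragma below,
    // so acc is shared in the enclosing context of the reduction.
    float acc = 0.0f;

    #pragma omp parallel for simd reduction(+: acc)
    for (int k = 0; k < n; k++) {
        acc += alpha * a[k] * b[k];
    }
    return acc;
}

// Hypothetical helper: same computation with an explicit parallel region
// and a separate worksharing loop inside it.
float scaled_dot_split(const std::vector<float>& a,
                       const std::vector<float>& b,
                       float alpha) {
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());

    // Still declared before the parallel region, hence shared inside it.
    // Moving this declaration inside the braces below would make acc
    // private to each thread, which is the situation from case 1.
    float acc = 0.0f;

    #pragma omp parallel
    {
        #pragma omp for simd reduction(+: acc)
        for (int k = 0; k < n; k++) {
            acc += alpha * a[k] * b[k];
        }
    }
    return acc;
}
```

Built with OpenMP enabled (for example -fopenmp with GCC or Clang), both functions distribute the iterations across the thread team and combine the per-thread partial sums into acc. Without OpenMP the pragmas are ignored and the loops run sequentially. Note that a floating-point reduction may sum in a different order than the sequential loop, so results can differ slightly in the last bits for general inputs.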