Search code examples
parallel-processingopenmp

OMP loop dependency -


I try to parallelize the following code

double a = 1.5;
double b = 2.5;

double *arr = ...;

for (i = 0; i < N; ++i) 
{
    arr[i] = b;
    b *= a;
}

I have tried different constellations of private / lastprivate / firstprivate

e.g.

#pragma omp parallel for shared(arr) firstprivate(a, b)

but I don't get the right result (comparing to the sequential run)


Solution

  • I assume first you want to avoid the obvious solution where the recurence on b is removed:

    #pragma omp parallel for private(c)
    for (int i = 0; i < N; ++i) {
        c = b * pow(a,(double)i);
        arr[i] = c;
    }
    

    Because of the dependency, your initial code can not be parallelized like this. You have to determine the initial value of b for each thread according to the first iteration it handles (this requires a test inside the loop, not very good for the performances) AND assume that a given thread will execute a contiguous range of iterations (probably a schedule(static,M) clause can do, where M = (N-1)/(nbthreads)+1)

    int nbthreads, M;
    
    #pragma omp parallel firstprivate(b)
    {
        #pragma omp single
        {
            nbthreads = omp_get_num_threads()
            M = (N-1)/nbthreads + 1
        }
        bool first=true;
        #pragma omp for schedule(static,M)
        for (int i = 0; i < N; ++i) {
           if (first) {
               b *= pow(a,(double)i);
               first = false;
           }
           arr[i] = c;
           b *= a;
        }
    }
    

    The right solution actually consists in manually distributing the iterations to the threads, so that you know in advance what is the first iteration of each thread:

    int N=1234, M, nbthreads;
    double b=2.5, a=1.5;
    
    #pragma omp parallel
    {
        #pragma omp single 
        {
            nbthreads = omp_get_num_threads();
            M = (N-1)/nbthreads + 1;
        }
        int it = omp_get_thread_num();
        int lb = it*M;     // lower bound for the iteration in this thread
        int ub = (it+1)*M; // upper bound ...
        if (ub > N) ub = N;
    
        b = b * pow(a,(double)lb);
        for (int i = lb; i < ub; ++i) {
           arr[i] = b;
           b *= a;
        }
    }