c++ openmp reduction

Wrong reduction using OpenMP


I am using two different versions of a reduction in OpenMP and I get completely different results. Which of the following is wrong?

    omp_set_num_threads(t);
    long long unsigned int d = 0;
    #pragma omp parallel for default(none) shared(some_stuff) reduction(+:d)
    for (int i=start; i< n; i++)
    {
            d += calc(i,some_stuff);
    }

    cout << d << endl;

and the second version is this:

    omp_set_num_threads(t);
    //reduction array
    long long unsigned int* d = new long long unsigned int[t];
    for(int i = 0; i < t; i++)
            d[i] = 0;

    #pragma omp parallel for default(none) shared(somestuff, d)
    for (int i=start; i< n; i++)
    {                                                                                                              
            long long unsigned dd = calc(i, somestuff);
            d[omp_get_thread_num()] += dd;
    }

    long long unsigned int res = 0;
    for(int i = 0; i < omp_get_num_threads(); i++){
            res += d[i];
    }
    delete[] d;

    cout << res << endl;

Solution

  • The second version is wrong. omp_get_num_threads() returns the size of the current team, which is 1 when it is called outside a parallel region, so the final loop only adds d[0] and drops every other thread's partial sum. Since you explicitly set the number of threads to t, loop over t instead:

    for(int i = 0; i < t; i++){
            res += d[i];
    }
    

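    Alternatively, you could use omp_get_max_threads(), which can be called outside a parallel region and returns an upper bound on the number of threads a new parallel region would use.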
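
For comparison, here is a minimal sketch of both versions with the final loop fixed. The calc() below is a hypothetical stand-in (it just squares i) because the original calc and some_stuff are not shown. The function names, the std::vector in place of new[], and the serial fallbacks for building without -fopenmp are also assumptions made for illustration.

```cpp
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks (assumption) so the sketch still builds without -fopenmp.
static int omp_get_max_threads() { return 1; }
static int omp_get_num_threads() { return 1; }
static int omp_get_thread_num() { return 0; }
#endif

using u64 = unsigned long long;

// Hypothetical stand-in for the question's calc(i, some_stuff).
static u64 calc(int i) { return static_cast<u64>(i) * static_cast<u64>(i); }

// Called from serial code, so the "team" is just the calling thread.
int team_size_outside_parallel() { return omp_get_num_threads(); }

// Version 1: let OpenMP combine the per-thread copies of d.
u64 sum_with_reduction(int start, int n) {
    u64 d = 0;
    #pragma omp parallel for reduction(+:d)
    for (int i = start; i < n; i++) {
        d += calc(i);
    }
    return d;
}

// Version 2: one slot per thread, combined by hand afterwards.
u64 sum_with_partials(int start, int n) {
    // Queried outside the parallel region; an upper bound on the team size.
    const int slots = omp_get_max_threads();
    std::vector<u64> partial(slots, 0);

    #pragma omp parallel for
    for (int i = start; i < n; i++) {
        partial[omp_get_thread_num()] += calc(i);
    }

    // Iterate over every slot, not over omp_get_num_threads().
    u64 res = 0;
    for (int t = 0; t < slots; t++) {
        res += partial[t];
    }
    return res;
}
```

Both functions should return the same value for the same range, and team_size_outside_parallel() shows why the original final loop only ever read d[0]. Unless the per-thread slots are needed for something else, the reduction clause is the simpler choice.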