Array reduction with OpenMP leads to "user defined reduction not found for"


I'm working on a school assignment in which I have to compute the histogram of an image.

Everything was going well, but when I tried to parallelize the code with OpenMP, the compiler gave me this error: user defined reduction not found for 'histog'

The code that I used is this:

void HistogramaParaleloRed(int *histog)
{

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < NG; i++)
        {
            histog[i] = 0;
        }

        #pragma omp for reduction(+ : histog)
        for (int i = 0; i < N; i++)
        {
            for (int j = 0; j < N; j++)
            {
                histog[IMAGEN[i][j]]++;
            }
        }
    }
}

And the call to the function in Main is: HistogramaParaleloRed(histog_pal_red);


Solution

  • The error

    user defined reduction not found for
    

    can happen either because the code was compiled with a compiler that does not support the OpenMP 4.5 array reduction feature (or that compiler is misconfigured), or because you are trying to reduce a naked pointer (as in your example). In the latter case, the compiler cannot tell how many elements are to be reduced.

    So either use a compiler that supports OpenMP 4.5 or later and take advantage of the array sections feature, which states the number of elements explicitly, as follows:

    void HistogramaParaleloRed(int *histog)
    {
    
        #pragma omp parallel
        {
            #pragma omp for
            for (int i = 0; i < NG; i++)
            {
                histog[i] = 0;
            }
    
            #pragma omp for reduction(+ : histog[:NG])
            for (int i = 0; i < N; i++)
            {
                for (int j = 0; j < N; j++)
                {
                    histog[IMAGEN[i][j]]++;
                }
            }
        }
    }
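
    As a minimal self-contained sketch of the array-section form, the block below fills in the pieces the question leaves out. The values of N and NG, the pixel type of IMAGEN, and the names fill_image and histogram_sections are assumptions made for illustration only.

```c
#define N  64    /* assumed image side length */
#define NG 256   /* assumed number of gray levels */

static unsigned char IMAGEN[N][N];  /* assumed pixel type */

/* Fill the image with a deterministic pattern (illustration only):
   every gray level 0..NG-1 appears the same number of times. */
static void fill_image(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            IMAGEN[i][j] = (unsigned char)((i * N + j) % NG);
}

/* histog must point to at least NG ints. */
static void histogram_sections(int *histog)
{
    for (int i = 0; i < NG; i++)
        histog[i] = 0;

    /* The array section histog[:NG] tells the compiler how many elements
       to privatize per thread and combine at the end; a bare pointer
       carries no length information. */
    #pragma omp parallel for reduction(+ : histog[:NG])
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            histog[IMAGEN[i][j]]++;
}
```

    With GCC or Clang this is compiled with -fopenmp; without that flag the pragma is ignored and the loop simply runs serially, producing the same counts.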
    

    or alternatively, implement the reduction yourself.

    Implement the Reduction manually

    One approach is to create a structure shared among threads (i.e., thread_histog) with one row per thread. Each thread updates only its own row, and afterward the threads sum the rows into the original histog array.

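    void HistogramaParaleloRed(int *histog, int number_threads)
    {
        // One row of NG bins per thread. A variable-length array cannot
        // take an initializer, so it is zeroed explicitly.
        int thread_histog[number_threads][NG];
        for (int t = 0; t < number_threads; t++)
            for (int i = 0; i < NG; i++)
                thread_histog[t][i] = 0;

        // Request at most number_threads threads so that every
        // thread_id is a valid row index.
        #pragma omp parallel num_threads(number_threads)
        {
            int thread_id = omp_get_thread_num();
            #pragma omp for
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++)
                    thread_histog[thread_id][IMAGEN[i][j]]++;

            // The implicit barrier of the loop above guarantees that
            // all rows are complete before they are summed.
            #pragma omp for nowait
            for (int i = 0; i < NG; i++)
            {
                histog[i] = 0;
                for (int j = 0; j < number_threads; j++)
                    histog[i] += thread_histog[j][i];
            }
        }
    }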
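
    A common variation on the manual reduction, sketched here as an alternative rather than as the method above, keeps each thread's partial histogram in an array declared inside the parallel region (and therefore private) and merges it into histog under a critical section. This avoids passing the thread count around. The values of N and NG, the pixel type of IMAGEN, and the name histogram_private_merge are assumptions for illustration.

```c
#define N  64    /* assumed image side length */
#define NG 256   /* assumed number of gray levels */

static unsigned char IMAGEN[N][N];  /* assumed pixel type */

/* histog must point to at least NG ints. */
static void histogram_private_merge(int *histog)
{
    for (int i = 0; i < NG; i++)
        histog[i] = 0;

    #pragma omp parallel
    {
        /* Declared inside the parallel region, so each thread gets
           its own zero-initialized copy. */
        int local[NG] = {0};

        /* nowait: a thread may start merging as soon as it has
           finished its own share of the rows. */
        #pragma omp for nowait
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                local[IMAGEN[i][j]]++;

        /* Only one thread at a time adds its partial counts. */
        #pragma omp critical
        for (int k = 0; k < NG; k++)
            histog[k] += local[k];
    }
}
```

    The merge costs NG additions per thread, which is usually small next to the N * N pixel updates, so the critical section is rarely the bottleneck for a histogram of this size.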
    

    Another approach is to create an array of locks, one for each element of the histog array. Whenever a thread updates a given histog position, it first acquires the lock corresponding to that position, so that no other thread can update the same array position concurrently.

    void HistogramaParaleloRed(int *histog)
    {
        omp_lock_t locks[NG];
        #pragma omp parallel
        {
            // Initialize one lock per bin and zero the histogram.
            #pragma omp for
            for (int i = 0; i < NG; i++)
            {
                omp_init_lock(&locks[i]);
                histog[i] = 0;
            }

            #pragma omp for
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++) {
                    int pos = IMAGEN[i][j];
                    omp_set_lock(&locks[pos]);
                    histog[pos]++;   // shared bin, protected by its lock
                    omp_unset_lock(&locks[pos]);
                }

            // The implicit barrier above ensures no lock is still in use.
            #pragma omp for nowait
            for (int i = 0; i < NG; i++)
                omp_destroy_lock(&locks[i]);
        }
    }
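
    When the protected operation is a single integer increment, as it is here, the same per-element exclusion can also be expressed with #pragma omp atomic instead of explicit lock objects. The sketch below shows that substitution; it is a different construct from the lock array above, not a restatement of it. The values of N and NG, the pixel type of IMAGEN, and the name histogram_atomic are assumptions for illustration.

```c
#define N  64    /* assumed image side length */
#define NG 256   /* assumed number of gray levels */

static unsigned char IMAGEN[N][N];  /* assumed pixel type */

/* histog must point to at least NG ints. */
static void histogram_atomic(int *histog)
{
    for (int i = 0; i < NG; i++)
        histog[i] = 0;

    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
        {
            /* The increment of this one bin is performed atomically,
               so concurrent updates to the same bin are not lost. */
            #pragma omp atomic
            histog[IMAGEN[i][j]]++;
        }
}
```

    Both the lock and the atomic versions serialize updates that hit the same bin, so images dominated by a few gray levels will see more contention than with the per-thread approaches.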