I'm working on a school assignment and need to compute the histogram of an image.
Everything was going well, but when I tried to parallelize the code with OpenMP, the compiler gave me this error: user defined reduction not found for 'histog'
The code I used is this:
void HistogramaParaleloRed(int *histog)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < NG; i++)
        {
            histog[i] = 0;
        }
        #pragma omp for reduction(+ : histog)
        for (int i = 0; i < N; i++)
        {
            for (int j = 0; j < N; j++)
            {
                histog[IMAGEN[i][j]]++;
            }
        }
    }
}
The call to the function in main is: HistogramaParaleloRed(histog_pal_red);
The error "user defined reduction not found for" can happen for one of two reasons: either the code was compiled with a compiler that does not support the OpenMP 4.5 array reduction feature (or that compiler is misconfigured), or you are trying to reduce a naked pointer (as is the case in your example). In the latter case, the compiler cannot tell how many elements are to be reduced.
So either use a compiler that supports OpenMP 4.5 and take advantage of its array sections feature, which states the number of elements explicitly, as follows:
void HistogramaParaleloRed(int *histog)
{
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < NG; i++)
        {
            histog[i] = 0;
        }
        // The array section covers all NG bins of the histogram
        // (NG is the number of bins, N is the image dimension).
        #pragma omp for reduction(+ : histog[:NG])
        for (int i = 0; i < N; i++)
        {
            for (int j = 0; j < N; j++)
            {
                histog[IMAGEN[i][j]]++;
            }
        }
    }
}
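To make the array-section reduction easier to try in isolation, here is a minimal self-contained sketch. The function name histogram_array_section, the row-major unsigned char image layout, and NG = 256 are assumptions made for illustration; they are not part of the original code, which uses the globals IMAGEN, N and NG.

```c
#include <assert.h>
#include <stddef.h>

#define NG 256 /* assumed number of gray levels (one bin per unsigned char value) */

/* Hypothetical helper: histogram of an n x n image stored row-major.
   Compile with OpenMP enabled (e.g. -fopenmp) for the parallel version;
   without it the pragma is ignored and the loop simply runs serially. */
void histogram_array_section(const unsigned char *image, int n, int *histog)
{
    assert(image != NULL && histog != NULL);

    for (int g = 0; g < NG; g++)
        histog[g] = 0;

    /* Each thread works on a private, zero-initialized copy of histog[0..NG-1];
       the private copies are added into the original array when the loop ends. */
    #pragma omp parallel for reduction(+ : histog[:NG])
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            histog[image[(size_t)i * n + j]]++;
}
```

Calling histogram_array_section(img, 4, h) on a 4 x 4 image fills h with one count per gray level. Passing the image and its size as parameters, rather than relying on globals, is only a choice made to keep the sketch standalone.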
or alternatively, implement the reduction yourself.
Implement the Reduction manually
One approach is to create a structure shared among the threads (i.e., thread_histog) in which each thread updates only its own row; afterward, the threads sum the rows of that structure into the original histog array.
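void HistogramaParaleloRed(int *histog, int number_threads)
{
    // One row of counters per thread. A variable-length array cannot take
    // an initializer, so it is zeroed with memset (requires <string.h>).
    int thread_histog[number_threads][NG];
    memset(thread_histog, 0, sizeof(thread_histog));
    // Request exactly number_threads threads so that thread ids index valid rows.
    #pragma omp parallel num_threads(number_threads)
    {
        int thread_id = omp_get_thread_num(); // requires <omp.h>
        #pragma omp for
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                thread_histog[thread_id][IMAGEN[i][j]]++;
        // The implicit barrier above ensures all rows are complete before they are summed.
        #pragma omp for nowait
        for (int i = 0; i < NG; i++)
        {
            int sum = 0;
            for (int j = 0; j < number_threads; j++)
                sum += thread_histog[j][i];
            histog[i] = sum; // overwrite, so histog need not be zeroed beforehand
        }
    }
}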
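A variation on the same manual reduction, sketched here under assumptions: instead of a shared two-dimensional array indexed by thread id, each thread keeps a private local array and merges it into histog inside a critical section. This is not the code above but a different way of doing the same thing; the function name histogram_manual_reduction, the row-major unsigned char image and NG = 256 are illustrative choices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NG 256 /* assumed number of gray levels */

/* Hypothetical helper: manual reduction with one private array per thread. */
void histogram_manual_reduction(const unsigned char *image, int n, int *histog)
{
    assert(image != NULL && histog != NULL);
    memset(histog, 0, NG * sizeof *histog);

    #pragma omp parallel
    {
        int local[NG] = {0}; /* declared inside the region, so private to each thread */

        /* nowait: a thread may start merging as soon as its own share is counted */
        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                local[image[(size_t)i * n + j]]++;

        /* Only one thread at a time adds its private counts to the shared result. */
        #pragma omp critical
        for (int g = 0; g < NG; g++)
            histog[g] += local[g];
    }
}
```

Because the private array lives on each thread's stack, the function does not need to know the number of threads in advance, which avoids the mismatch that can occur when number_threads differs from the team size.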
Another approach is to create an array of locks, one for each element of the histog array. Whenever a thread updates a given histog position, it first acquires the lock corresponding to that position, so that no other thread can update the same array position concurrently.
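void HistogramaParaleloRed(int *histog)
{
    omp_lock_t locks[NG]; // one lock per histogram bin (requires <omp.h>)
    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 0; i < NG; i++)
        {
            omp_init_lock(&locks[i]);
            histog[i] = 0; // start every bin at zero, as in the original function
        }
        // The implicit barrier above ensures every lock is initialized before use.
        #pragma omp for
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
            {
                int pos = IMAGEN[i][j];
                omp_set_lock(&locks[pos]);
                histog[pos]++; // update the shared histogram while holding the lock
                omp_unset_lock(&locks[pos]);
            }
        // The implicit barrier above ensures no lock is still held when it is destroyed.
        #pragma omp for nowait
        for (int i = 0; i < NG; i++)
            omp_destroy_lock(&locks[i]);
    }
}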
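For a single increment like this one, the per-element protection that the locks provide can also be expressed with an atomic update. The following is a sketch of that swapped-in mechanism, not of the lock code above; the function name histogram_atomic, the row-major unsigned char image and NG = 256 are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NG 256 /* assumed number of gray levels */

/* Hypothetical helper: shared histogram protected by atomic increments. */
void histogram_atomic(const unsigned char *image, int n, int *histog)
{
    assert(image != NULL && histog != NULL);
    memset(histog, 0, NG * sizeof *histog);

    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
        {
            int pos = image[(size_t)i * n + j];
            /* The atomic construct makes this one read-modify-write indivisible,
               so two threads hitting the same bin cannot lose an update. */
            #pragma omp atomic
            histog[pos]++;
        }
}
```

Like the lock version, this serializes threads that hit the same bin, so on images dominated by a few gray levels it may be slower than either reduction approach.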