I have this OpenMP code that performs a simple reduction:
for (k = 0; k < m; k++)
{
    #pragma omp parallel for private(i) reduction(+:mysum) schedule(static)
    for (i = 0; i < m; i++)
    {
        mysum += a[i][k] * a[i][k];
    }
}
I want to create a code equivalent to this one, but using OpenMP Tasks. Here is what I tried by following this article:
for (k = 0; k < m; k++)
{
    #pragma omp parallel reduction(+:mysum)
    {
        #pragma omp single
        {
            for (i = 0; i < m; i++)
            {
                #pragma omp task private(i) shared(k)
                {
                    partialSum += a[i][k] * a[i][k];
                }
            }
        }
        #pragma omp taskwait
        mysum += partialSum;
    }
}
The variable partialSum is declared as threadprivate, and it is also a global variable:
int partialSum = 0;
#pragma omp threadprivate(partialSum)
a is a simple array of ints (m x m).
The problem is that when I run the code above (the one with tasks) multiple times, I get different results.
Do you have any idea what I should change to make this work?
Thank you respectfully
private variables are uninitialized (at least, they are not initialized with their outside value). i should be firstprivate.
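A minimal sketch of what firstprivate does on a task, assuming a hypothetical helper fill_squares and a hypothetical size N (neither is from the question). Each task gets its own copy of i, initialized with the value i had when the task was created, so it indexes the element it was meant to:

```c
#define N 8  /* hypothetical size, for illustration only */

/* Hypothetical helper: each task stores the square of the i it captured. */
void fill_squares(int out[N])
{
    int i;  /* declared outside the parallel region, as in the question */

    #pragma omp parallel
    {
        #pragma omp single
        {
            for (i = 0; i < N; i++)
            {
                /* firstprivate(i): the task receives a private copy of i,
                   initialized with the value i has at task creation.
                   With private(i) the copy would be uninitialized. */
                #pragma omp task firstprivate(i) shared(out)
                {
                    out[i] = i * i;
                }
            }
        }
        /* The implicit barrier at the end of single (and of the parallel
           region) waits for all generated tasks to finish. */
    }
}
```

Compiled without OpenMP support the pragmas are ignored and the function simply runs serially, which makes it easy to compare results.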
If you just get rid of private(i) shared(k), the default data-sharing rules do the right thing, provided i is declared locally. k comes from outside of the parallel region and thus is implicitly shared in the parallel region. This also makes it implicitly shared in the task generating construct. Right now i is also declared outside, so by default it would also be shared/shared, and a task could read it after the loop has already moved on. If you define it locally instead (for (int i ...), it becomes implicitly private to the parallel region and thus implicitly firstprivate in the task generating construct.
You should also add
#pragma omp atomic
mysum += partialSum;
On the other hand, you don't necessarily need the taskwait (see this answer).
Note that the talk uses firstprivate correctly.
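Putting these points together, here is a rough sketch of how the task version could look. It is an illustration under some assumptions, not a definitive implementation: the function name sum_of_squares and the fixed size M are hypothetical, and the reset of partialSum at the start of each parallel region is my own addition, since a threadprivate value can persist from one parallel region to the next. The sketch keeps the reduction clause from the question, so each thread updates its own copy of mysum; if mysum were shared instead, that update would need the atomic.

```c
#define M 4  /* hypothetical size, for illustration only */

static int partialSum = 0;
#pragma omp threadprivate(partialSum)

/* Hypothetical wrapper around the loops from the question. */
int sum_of_squares(int a[M][M])
{
    int mysum = 0;

    for (int k = 0; k < M; k++)
    {
        #pragma omp parallel reduction(+:mysum)
        {
            /* Assumption: reset this thread's copy so values from the
               previous k iteration are not counted twice. */
            partialSum = 0;

            #pragma omp single
            {
                /* i is local, so it is implicitly firstprivate in the task;
                   k and a come from outside and are implicitly shared. */
                for (int i = 0; i < M; i++)
                {
                    #pragma omp task
                    {
                        /* Adds to the copy owned by the executing thread. */
                        partialSum += a[i][k] * a[i][k];
                    }
                }
            }
            /* The implicit barrier at the end of single guarantees that all
               tasks have completed, so no taskwait is used here. */

            /* mysum is this thread's reduction copy; with a shared mysum
               this line would need "#pragma omp atomic". */
            mysum += partialSum;
        }
    }
    return mysum;
}
```

For a 4 x 4 array whose row i holds the value i + 1 in every column, each column contributes 1 + 4 + 9 + 16 = 30, so the result should be 120 regardless of the number of threads.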