I was trying to using taskloop to finding sum of element in array.
#include <iostream>
#include <omp.h>
using namespace std;
#define NUMTHREAD 6
#define SIZE 100
int main() {
int* n = new int [SIZE];
for (int i = 0; i < SIZE; ++i) {
n[i] = 1;
}
long sum =0;
#pragma omp parallel num_threads(NUMTHREAD)
{
#pragma omp taskloop
for (int i = 0; i < SIZE; i++) {
sum += n[i];
}
}
cout << sum<< "\n";
return 0;
}
but output is
600
It's should be 100
I don't know why sum is shared.
I was trying to made sum not shared and for each thread.
#pragma omp parallel num_threads(NUMTHREAD)
{
#pragma omp taskloop private(sum)
for (int i = 0; i < SIZE; i++) {
sum += n[i];
}
}
but Output is
0
IDK why?
taskloop
construct is a task-generating construct, not a worksharing-loop construct. You can read the following in OpenMP specification:The taskloop construct is a task generating construct. When a thread encounters a taskloop construct, the construct partitions the iterations of the associated loops into explicit tasks for parallel execution.
In your case 6 threads encounter the taskloop construct, therefore the loop is executed 6 times, and you got a bigger result than expected. To correct it you have to i) make sure that only a single thread executes the tasks generating construct, or ii) use a worksharing-loop construct.
There is a race condition at line sum += n[i];
, as different threads update sum
simultaneously. To correct it the best option is to use reduction (reduction(+:sum)
)
In the second case the sharing attribute of sum
is private, so the variable sum
will be privatized. It means that a local variable is created, and only the value of the local variable is changed. It does not change the value of sum
in the enclosing context, therefore you got 0
.
To sum up, here are the 2 alternatives mentioned above to correct your code:
i) using single
construct and taskloop
:
#pragma omp parallel num_threads(NUMTHREAD)
#pragma omp single nowait
{
#pragma omp taskloop reduction(+:sum)
for (int i = 0; i < SIZE; i++) {
sum += n[i];
}
}
ii) using a worksharing-loop construct:
#pragma omp parallel num_threads(NUMTHREAD)
{
#pragma omp for reduction(+:sum)
for (int i = 0; i < SIZE; i++) {
sum += n[i];
}
}
Some minor, off-topic comments:
new
operator) is never freed, you should use delete
to deallocate it. As pointed out by @VictorEijkhout it is even better if you do not use new
at all. Use vector
or array
from the std library.const/constexpr int SIZE=100;
over #define SIZE 100
. More details can be found e.g. here.