This is the code for calculating a numerical approximation of the area of the Mandelbrot Set. I don't understand why can't I write reduction(+ : numoutside)
instead of shared(numoutside)
when declaring variables for the parallel region. When I do that, I get incorrect result. Can someone explain the problem with using reduction with OpenMP tasks, because it really confuses me why it doesn't work?
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <omp.h>
#define NPOINTS 2000
#define MAXITER 2000
struct complex
{
double real;
double imag;
};
int main()
{
int i, j, iter, numoutside = 0;
double area, error, ztemp;
struct complex z, c;
double time1, time2, elapsed;
/*
*
*
* Outer loops run over npoints, initialise z=c
*
* Inner loop has the iteration z=z*z+c, and threshold test
*/
time1 = omp_get_wtime();
#pragma omp parallel default(none) private(i, j, iter, c, z, ztemp) \
shared(numoutside) // why not reduction(+ : numoutside) ???
{
#pragma omp single
{
for (i = 0; i < NPOINTS; i++)
{
#pragma omp task
for (j = 0; j < NPOINTS; j++)
{
// #pragma omp task - LOSE NE TREBA OVDE - NAPRAVICE SE PREVISE
// TASKOVA (NPOINTS * NPOINTS) I USPORICEMO SISTEM UMESTO DA UBRZAMO
{
c.real = -2.0 + 2.5 * (double)(i) / (double)(NPOINTS) + 1.0e-7;
c.imag = 1.125 * (double)(j) / (double)(NPOINTS) + 1.0e-7;
z = c;
for (iter = 0; iter < MAXITER; iter++)
{
ztemp = (z.real * z.real) - (z.imag * z.imag) + c.real;
z.imag = z.real * z.imag * 2 + c.imag;
z.real = ztemp;
if ((z.real * z.real + z.imag * z.imag) > 4.0e0)
{
#pragma omp atomic
numoutside++;
break;
}
}
}
}
}
}
}
time2 = omp_get_wtime();
elapsed = time2 - time1;
printf("elapsed %f\n", elapsed);
/*
* Calculate area and error and output the results
*/
area = 2.0 * 2.5 * 1.125 * (double)(NPOINTS * NPOINTS - numoutside) / (double)(NPOINTS * NPOINTS);
error = area / (double)NPOINTS;
printf("Area of Mandlebrot set = %12.8f +/- %12.8f\n", area, error);
}
OpenMP task rules are a bit tricky. A variable that is shared
in the enclosing OpenMP construct is by default still shared
in the task, otherwise it is firstprivate
.
What happens:
numoutside
is declared in a reduction clause, so the compiler create private instances of numoutside
for each thread.single
directive, only one of these instances is used.numoutside
was not shared, it becomes firstprivate
in each tasks: the compiler creates a fully new instance for each task. Whatever you doing on this instance within the task, it is lost once the task ends.Note that declaring numoutside
as shared will work only by chance if you don't protect the numoutside++
instruction with an atomic
directive.
I don't know how to perform reductions with the task contructs (I've seen that there exists a task_reduction
clause in the OmpenMP standard, but I don't know how it works).
But actually, I wonder why you want to use tasks instead of a simple parallel loop...