Search code examples
cmultithreadingopenmprace-condition

openmp reduction does not provide the same answer as the sequential methodd


I am trying to parallelize a vector dot product program using OpenMP. The following code shows what I did.

#define N 1000000

float dotProduct = 0;
float vector1Host[N], vector2Host[N]; //each element in the vectors are initialized to a value between 1 and 2

#pragma omp parallel for private(i)  reduction(+:dotProduct)
        for (i = 0; i < N; i++)
                dotProduct += vector1Host[i] * vector2Host[i];

The answer I get here is slightly different than what I get when I do the multiplication sequentially. Further, when I remove the reduction(+:dotProduct) and calculate the multiplications of each item seperately and add them together later (sequentially) I get the same answer as the completely sequential method.

float productComponents[N];

#pragma omp parallel for private(i)
     for (i = 0; i < N; i++)
            productComponents[i] += vector1Host[i] * vector2Host[i];

for (i=0; i<N; i++)
     dotProduct += productComponents[i];

The issue with this method is the performance. Please help me in finding the error in the first method, or an alternative method with good performance.

Update: I added the output from a sample run.

N=1000000: Ans=2251335.750000: Time(ms)=2.59163 //sequential
N=1000000: Ans=2251356.750000: Time(ms)=0.65846 //openmp


Solution

  • Floating point operations are not commutative. Therefore it is possible that your code is giving differing and possibly unpredictable results based on the order in which the floats are added to the accumulating variable.

    Openmp due to the nature of parallelising the code results in the additions being performed in an arbitrary order and thus causes slightly unpredictable value due to the above non commutative behaviour of floats.

    Either you need to accept this unpredictability or serialise the additions.

    The other option would be to use a fixed point library which was able to guarantee commutative addition, in which case the answer would be predictable regardless of the resulting order of the additions.