I am trying to parallelize a simple MD code with OpenMP. However, even when I use a single thread with OpenMP, the result is not correct. If I comment out the OpenMP part of the code, the result is correct. I think the problem is an incorrect use of shared and private variables in OpenMP. Can anyone help me with this problem? Thank you very much! The code is below:
int i, j;
// parallelization using openmp
#pragma omp parallel shared(fx2,fy2,fz2,rx,ry,rz,N_molecular,L) private(i, j, xmin, ymin, zmin,rmin2, f)
{
    #pragma omp for// reduction(+:fx2,fy2,fz2)
    /*calculate all pair forces acting on each particles again*/
    for(i = 0; i < N_molecular-1; i++){
        for(j = i+1; j < N_molecular; j++){
            xmin = rx[i] - rx[j] - L*round ((rx[i] - rx[j])/L);
            ymin = ry[i] - ry[j] - L*round ((ry[i] - ry[j])/L);
            zmin = rz[i] - rz[j] - L*round ((rz[i] - rz[j])/L);
            rmin2 = xmin*xmin + ymin*ymin + zmin*zmin;
            if(rmin2 < rcut*rcut)
            {
                f = 48./pow(rmin2,7) - 24./pow(rmin2,4);
                //#pragma omp atomic update
                fx2[i] += f*xmin;
                //#pragma omp atomic update
                fy2[i] += f*ymin;
                //#pragma omp atomic update
                fz2[i] += f*zmin;
                //#pragma omp atomic update
                fx2[j] = fx2[j] - f*xmin;
                //#pragma omp atomic update
                fy2[j] = fy2[j] - f*ymin;
                //#pragma omp atomic update
                fz2[j] = fz2[j] - f*zmin;
            }
        }
    }
}
The tricky part in parallelising this code is making sure that the updates to the force arrays do not generate race conditions. Updates made through the i index cause no issue, since different threads handle different i indexes. However, this isn't true for the j indexes: two threads working on different values of i can update the same element j at the same time, and that is enough to prevent you from using this approach as written.
A solution would be to store your forces in local per-thread arrays, and to accumulate the individual results into the global arrays upon exit of the parallel region. Other possibilities exist, which I describe in (I hope) enough detail in this answer to a question very similar to yours.