I have the following monte carlo code with openmp
multithreading
int main()
{
std::uniform_int_distribution<int> dir(0, 1);
std::uniform_int_distribution<int> xyz(0, 2);
std::uniform_real_distribution<double> angle(0,360);
mt19937 gen{0};
auto M = 20;
long double sum = 0;
auto num_trials = 10000;
// omp_set_num_threads(12);
#pragma omp parallel
{
double loc_sum = 0.0;
#pragma omp for
for(int i = 0; i < num_trials; ++i)
{
double x = 0;
double y = 0;
double z = 0;
double r = 0;
auto N = 0;
while(r < M)
{
auto d = dir(gen);
auto p = xyz(gen);
if(p == 0)
{
x += (d == 1) ? -1 : 1;
}
else if(p == 1)
{
y += (d == 1) ? -1 : 1;
}
else
{
z += (d == 1) ? -1 : 1;
}
r = std::sqrt(x * x + y * y + z * z);
++N;
}
loc_sum += N;
}
#pragma omp critical
sum += loc_sum;
}
}
The variable sum
is quite different for executing in serial vs multiple threads. I am expecting a slight difference due to the calls to the random uniform distributions, but the difference that I am observing is too significant and can't be due to randomness, and I suspect there's a bug in my multithreaded code.
Are there any race conditions or data races in this code that's affecting sum
?
The issue is that you call the generators (dir
and xyz
) without a lock around them. You're also using the PRNG (gen
) without locking.
Neither of those calls is atomic, since implementing them as such by default would make single threaded code slower than it needs to be.
Marking the lines to generate d
and p
with #pragma omp critical
should fix the issue.
If you don't want a critical section, you'll need seperate dir
, xyz
, and gen
objects in each thread. The generators (dir
and xyz
) can simply be copied. The PRNG (gen
) should be properly initialized per thread, otherwise you end up with PNGs with the exact same state in each thread. E.g.:
std::random_device rd; /* Outside the parallel section. */
// Code below once per thread.
/* Initialization of the PRNG calls std::random_device::operator() which
* needs a lock around it when called in parallel. */
std::mt199937 gen;
#pragma omp critical
{
gen.seed(rd());
}
// for-loop starts here.