Search code examples
thread-safetymontecarloprngopenmp

Correct OpenMP pragmas for pi monte carlo in C with not thread-safe random number generator


I need some help to parallelize the pi calculation with the monte carlo method with openmp by a given random number generator, which is not thread safe.

First: This SO thread didn't help me.

My own try is the following #pragma omp statements. I thought the i, x and y vars should be init by each thread and should than be private. z ist the sum of all hits in the circle, so it should be summed after the implied barriere after the for loop.

Think the main problem ist the static state var of the random number generator. I made a critical section where the functions are called, so that only one thread per time could execute it. But the Pi solutions doesn't scale with more higher values.

Note: I should not use another RNG, but its okay to make little changes on it.

int main (int argc, char *argv[]) {

    int i, z = 0, threads = 8, iters = 100000;
    double x,y, pi;

    #pragma omp parallel firstprivate(i,x,y) reduction(+:z) num_threads(threads)
        for (i=0; i<iters; ++i) {
            #pragma omp critical
            {
                x = rng_doub(1.0);
                y = rng_doub(1.0);
            }
            if ((x*x+y*y) <= 1.0)
                z++;
        }

    pi = ((double) z / (double) (iters*threads))*4.0;
    printf("Pi: %lf\n", pi);;
    return 0;
}

This RNG is actually an included file, but as I'm not sure if I create the header file correct, I integrated it in the other program file, so I have only one .c file.

#define RNG_MOD 741025

int rng_int(void) {
    static int state = 0;

    return (state = (1366 * state + 150889) % RNG_MOD);
}

double rng_doub(double range) {
    return ((double) rng_int()) / (double) ((RNG_MOD - 1)/range);
}

I've also tried to make the static int state global, but it doesn't change my result, maybe I done it wrong. So please could you help me make the correct changes? Thank you very much!


Solution

  • Try the code below. It makes a private state for each thread. I did something similar with the at rand_r function Why does calculation with OpenMP take 100x more time than with a single thread?

    Edit: I updated my code using some of Hristo's suggestions. I used threadprivate (for the first time). I also used a better rand function which gives a better estimate of pi but it's still not good enough.

    One strange things was I had to define the function rng_int after threadprivate otherwise I got an error "error: 'state' declared 'threadprivate' after first use". I should probably ask a question about this.

    //gcc -O3 -Wall -pedantic -fopenmp main.c
    #include <omp.h>
    #include <stdio.h>
    
    #define RNG_MOD 0x80000000
    int state;
    
    int rng_int(void);
    double rng_doub(double range);
    
    int main() {
        int i, numIn, n;
        double x, y, pi;
    
        n = 1<<30;
        numIn = 0;
    
        #pragma omp threadprivate(state)
        #pragma omp parallel private(x, y) reduction(+:numIn) 
        {
    
            state = 25234 + 17 * omp_get_thread_num();
            #pragma omp for
            for (i = 0; i <= n; i++) {
                x = (double)rng_doub(1.0);
                y = (double)rng_doub(1.0);
                if (x*x + y*y <= 1) numIn++;
            }
        }
        pi = 4.*numIn / n;
        printf("asdf pi %f\n", pi);
        return 0;
    }
    
    int rng_int(void) {
       // & 0x7fffffff is equivalent to modulo with RNG_MOD = 2^31
       return (state = (state * 1103515245 + 12345) & 0x7fffffff);
    }
    
    double rng_doub(double range) {
        return ((double)rng_int()) / (((double)RNG_MOD)/range);
    }
    

    You can see the results (and edit and run the code) at http://coliru.stacked-crooked.com/a/23c1753a1b7d1b0d