Tags: c++, c++17, openmp

Sampling conditional distribution OpenMP


I have a function which draws a random sample:

Sample sample();

I have a function which checks whether a sample is valid:

bool is_valid(Sample s);

This simulates a conditional distribution. Now I want a lot of valid samples (most samples will not be valid).

So I want to parallelize this code with OpenMP:

vector<Sample> valid_samples;
while(valid_samples.size() < n) {
    Sample s = sample();
    if(is_valid(s)) {
        valid_samples.push_back(s);
    }
}

How would I do this? Most of the OpenMP examples I found were simple for loops where the number of iterations is known at the beginning.

The sample() function has a

thread_local std::mt19937_64 gen([](){
    std::random_device d;
    std::uniform_int_distribution<int> dist(0,10000);
    return dist(d);
}());

as a random number generator. Is this valid and thread-safe if I assume that my device has a source of randomness? Are there better solutions?


Solution

  • You may employ OpenMP task parallelism. The simplest solution would be to define a task as a single sample insertion:

    vector<Sample> valid_samples(n); // pre-sized so each task can write its own slot without synchronization
    
    void insert_ith(size_t i)
    {
      // keep drawing until slot i holds a valid sample
      do {
        valid_samples[i] = sample();
      } while (!is_valid(valid_samples[i]));
    }
    
    #pragma omp parallel
    {
      #pragma omp single  // one thread creates the tasks, the whole team executes them
      {
        for (size_t i = 0; i < n; i++)
        {
          #pragma omp task
          insert_ith(i);
        }
      }
    }
    

    Note that there might be performance issues with this single-task-per-insertion mapping. First, there would be some false sharing involved, but likely worse, task management has an overhead that can be significant for very small tasks. In that case the remedy is simple: instead of a single insertion per task, let each task insert multiple items at once, such as 100 (see the sketch below). The chunk size is a trade-off: smaller chunks create more tasks and therefore more overhead, while larger chunks may result in worse load balancing.
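
    A minimal sketch of such a chunked variant could look as follows. It reuses valid_samples, n, sample() and is_valid() from above; the helper name insert_range and the chunk size of 100 are illustrative assumptions, not part of the original answer:

    // Illustrative sketch: each task fills a contiguous range of slots instead of a single one.
    void insert_range(size_t begin, size_t end)
    {
      for (size_t i = begin; i < end; i++)
      {
        do {
          valid_samples[i] = sample();
        } while (!is_valid(valid_samples[i]));
      }
    }
    
    const size_t chunk = 100; // tuning knob: smaller = more tasks/overhead, larger = worse load balancing
    
    #pragma omp parallel
    {
      #pragma omp single
      {
        for (size_t i = 0; i < n; i += chunk)
        {
          size_t end = (i + chunk < n) ? i + chunk : n; // last chunk may be shorter
          #pragma omp task firstprivate(i, end)
          insert_range(i, end);
        }
      }
    }

    Because every task writes to a disjoint range of the pre-sized vector, no locking is needed; only the chunk size has to be tuned for the target machine.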