malloc in openmp for parallel loop


I am a bit confused about the better way to use malloc()/free() in an OpenMP parallel for loop. Here are two ways I thought of, but I don't know which one is better. I learned from previous answers that calling malloc/free inside a loop can fragment memory.

Suppose I have a loop that runs over a million times:

for (size_t i = 0 ; i< 1000000; ++i){
    double * p = malloc(sizeof(double)*FIXED_SIZE); 

    /* FIXED_SIZE stays constant for the entire loop
    but is only known at run time */

    ....... /* Do some stuff using p array */

    free(p);
}

Now I want to parallelize the above loop with OpenMP.

Method 1: simply add a pragma on top of the for loop

#pragma omp parallel for
for (size_t i = 0 ; i< 1000000; ++i){

    /* no "omp atomic" here: it only applies to simple scalar updates,
       and malloc()/free() are already thread-safe */
    double * p = malloc(sizeof(double)*FIXED_SIZE); 
    
    ....... /* Do some stuff using p array */

    free(p);
}

Method 2: allocate one shared array outside the loop, with a slice for each thread

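/* omp_get_num_threads() returns 1 outside a parallel region,
   so size the array with omp_get_max_threads() instead */
int num_threads = omp_get_max_threads();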
double * p = malloc(sizeof(double)*FIXED_SIZE * num_threads); 

#pragma omp parallel for
for (size_t i = 0 ; i< 1000000; ++i){

    int thread_num = omp_get_thread_num();

    double * p1 = p + FIXED_SIZE*thread_num ;
    
    ....... /* Do some stuff using p1 array */
}
free(p);

Solution

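  • First open a parallel region and allocate one buffer per thread, then split the loop iterations among the threads with a worksharing "omp for" inside that region. A pointer declared inside the region is private to each thread, so every thread calls malloc()/free() once instead of once per iteration. malloc() and free() are thread-safe, so no extra synchronization is needed around them.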

    #pragma omp parallel
    {
      double * p = malloc(sizeof(double)*FIXED_SIZE);
    
      #pragma omp for
      for (size_t i = 0 ; i< 1000000; ++i) { ... }
    
      free(p);
    }
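
    For reference, here is a fuller sketch of the same pattern that compiles on its own. The function name checksum_with_per_thread_buffer, its parameters, and the fill-and-sum workload are made up for illustration; they stand in for the "do some stuff using p" part of the question. It also assumes fixed_size is greater than zero, and it adds a NULL check that the short version above leaves out.

```c
#include <stdlib.h>

/* Illustrative workload (not from the question): each iteration fills the
   scratch buffer with i + k and adds the buffer's sum to a running total.
   Returns the total, or -1.0 if any thread failed to allocate its buffer. */
double checksum_with_per_thread_buffer(size_t n_iter, size_t fixed_size)
{
    double total = 0.0;
    int alloc_failed = 0;

    #pragma omp parallel reduction(+:total) reduction(|:alloc_failed)
    {
        /* Declared inside the region, so each thread has its own pointer
           and performs exactly one malloc/free pair. */
        double *p = malloc(sizeof(double) * fixed_size);
        if (p == NULL)
            alloc_failed = 1;

        /* Every thread must reach the worksharing loop, so a thread whose
           allocation failed skips the work inside the loop body instead of
           leaving the region early. */
        #pragma omp for
        for (size_t i = 0; i < n_iter; ++i) {
            if (p == NULL)
                continue;

            double local = 0.0;
            for (size_t k = 0; k < fixed_size; ++k) {
                p[k] = (double)(i + k);
                local += p[k];
            }
            total += local;
        }

        free(p); /* free(NULL) is a no-op, so this is safe on failure too */
    }

    return alloc_failed ? -1.0 : total;
}
```

    A call such as checksum_with_per_thread_buffer(1000000, n) would replace the original loop. Build with the compiler's OpenMP flag (for example -fopenmp with GCC or Clang); without it the pragmas are ignored and the same code runs serially with a single allocation.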