I am a bit confused about the best way to use malloc()/free() inside an OpenMP parallel for loop. Here are two approaches I thought of, but I don't know which one is better. I learned from previous answers that calling malloc/free inside a loop can fragment memory.
Suppose I have a loop that runs a million times:
for (size_t i = 0 ; i< 1000000; ++i){
double * p = malloc(sizeof(double)*FIXED_SIZE);
/* FIXED_SIZE is the same for every iteration of the loop,
but its value is only known at run time */
....... /* Do some stuff using p array */
free(p);
}
Now I want to parallelize the above loop with OpenMP.
Method 1: simply add a pragma on top of the for loop
#pragma omp parallel for
for (size_t i = 0 ; i< 1000000; ++i){
/* malloc/free are thread-safe; "omp atomic" only applies to simple
scalar updates and cannot be placed on these calls */
double * p = malloc(sizeof(double)*FIXED_SIZE);
....... /* Do some stuff using p array */
free(p);
}
Method 2: allocate one shared array outside the loop, with a separate slice for each thread
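/* omp_get_num_threads() returns 1 outside a parallel region,
so ask for the maximum team size instead */
int num_threads = omp_get_max_threads();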
double * p = malloc(sizeof(double)*FIXED_SIZE * num_threads);
#pragma omp parallel for
for (size_t i = 0 ; i< 1000000; ++i){
int thread_num = omp_get_thread_num();
double * p1 = p + FIXED_SIZE*thread_num ;
....... /* Do some stuff using p1 array */
}
free(p);
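First open a parallel region and allocate a buffer for each thread inside it, then split the loop iterations among those threads with a worksharing loop. This way malloc/free is called once per thread instead of once per iteration: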
#pragma omp parallel
{
double * p = malloc(sizeof(double)*FIXED_SIZE);
#pragma omp for
for (size_t i = 0 ; i< 1000000; ++i) { ... }
free(p);
}
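To make the pattern above concrete, here is a minimal, self-contained sketch. The function name run_with_per_thread_buffer, its parameters, and the dummy workload (fill the buffer, then sum it) are all assumptions made up for illustration; only the structure (parallel region, one malloc per thread, omp for, free) comes from the snippet above. The reduction clause and the allocation-failure flag are additions so that the example returns a checkable result.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical workload: each iteration fills a scratch buffer of
   fixed_size doubles and adds its sum to a running total.
   Returns -1.0 if any thread failed to allocate its buffer. */
double run_with_per_thread_buffer(size_t n_iter, size_t fixed_size)
{
    assert(fixed_size > 0);  /* malloc(0) may legitimately return NULL */

    double total = 0.0;
    int alloc_failed = 0;

    #pragma omp parallel
    {
        /* One allocation per thread, not one per iteration. */
        double *p = malloc(sizeof *p * fixed_size);
        if (p == NULL) {
            #pragma omp atomic write
            alloc_failed = 1;
        }

        /* Every thread in the team must reach the worksharing loop,
           so a failed allocation is handled inside the loop body
           rather than by skipping the loop. */
        #pragma omp for reduction(+:total)
        for (size_t i = 0; i < n_iter; ++i) {
            if (p == NULL)
                continue;
            for (size_t k = 0; k < fixed_size; ++k)
                p[k] = (double)(i % 7);   /* stand-in for real work */
            double s = 0.0;
            for (size_t k = 0; k < fixed_size; ++k)
                s += p[k];
            total += s;
        }

        free(p);  /* free(NULL) is a no-op */
    }

    return alloc_failed ? -1.0 : total;
}
```

Compile with your compiler's OpenMP flag (for example -fopenmp with GCC or Clang). Without that flag the pragmas are ignored and the same code runs serially with a single buffer. Calling run_with_per_thread_buffer(1000, 8) should give the same total regardless of the number of threads, because the partial sums here are small integers.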