I expect to get the following output:
My rank is: 0 num is: 0
My rank is: 1 num is: 1
My rank is: 2 num is: 2
My rank is: 3 num is: 3
from the following code:
#pragma omp parallel
{
    int my_rank = omp_get_thread_num();
    #pragma omp parallel for num_threads(4)
    for(int i = 0; i < 4; i++){
        printf("My rank is: %d num is: %d\n", my_rank, i);
    }
}
But it gives the following output:
My rank is: 0 num is: 0
My rank is: 0 num is: 1
My rank is: 0 num is: 2
My rank is: 0 num is: 3
My rank is: 2 num is: 0
My rank is: 2 num is: 1
My rank is: 2 num is: 2
My rank is: 2 num is: 3
My rank is: 3 num is: 0
My rank is: 3 num is: 1
My rank is: 3 num is: 2
My rank is: 3 num is: 3
My rank is: 1 num is: 0
My rank is: 1 num is: 1
My rank is: 1 num is: 2
My rank is: 1 num is: 3
What is the problem?
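You should not repeat the parallel directive. You are already inside a parallel region, so the loop only needs #pragma omp for: each thread executing the parallel region then automatically takes a chunk of the loop. With #pragma omp parallel for inside the region, every outer thread instead opens its own nested parallel region and the whole loop runs once per outer thread, which is why each rank prints all four values of i. If you want to set the number of threads, put num_threads(4) on the outer directive, as in #pragma omp parallel num_threads(4), and follow it with #pragma omp for. In any case, for such a simple piece of code you can drop the outer block entirely, since it seems unneeded.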
Here's the correct version:
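#pragma omp parallel num_threads(4)
{
    /* Each thread of the team reads its own rank once. */
    int my_rank = omp_get_thread_num();
    /* Split the iterations among the threads that already exist. */
    #pragma omp for
    for(int i = 0; i < 4; i++){
        printf("My rank is: %d num is: %d\n", my_rank, i);
    }
}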
Or simply:
#pragma omp parallel for num_threads(4)
for(int i = 0; i < 4; i++){
    printf("My rank is: %d num is: %d\n", omp_get_thread_num(), i);
}
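To check the difference yourself, here is a minimal sketch that counts how many times each iteration runs under the two directives. The names run_worksharing, run_nested and N are made up for this illustration and do not come from the question, and the #else branch is only an assumed fallback so the file also builds without OpenMP (with gcc you would normally compile with -fopenmp).

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallback: without OpenMP the pragmas are ignored and
   everything runs on a single thread with rank 0. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define N 4  /* number of loop iterations, as in the question */

/* Worksharing loop: the iterations are divided among the threads of the
   existing team, so every iteration is executed exactly once overall. */
void run_worksharing(int counts[N]) {
    for (int i = 0; i < N; i++) counts[i] = 0;
    #pragma omp parallel num_threads(4)
    {
        int my_rank = omp_get_thread_num();
        #pragma omp for
        for (int i = 0; i < N; i++) {
            printf("for:          rank %d, i %d\n", my_rank, i);
            #pragma omp atomic
            counts[i]++;
        }
    }
}

/* Nested "parallel for": every outer thread starts its own inner region
   and the full loop runs once per outer thread.
   Returns the number of threads in the outer team. */
int run_nested(int counts[N]) {
    int outer = 0;
    for (int i = 0; i < N; i++) counts[i] = 0;
    #pragma omp parallel
    {
        int my_rank = omp_get_thread_num();
        /* Count the outer threads, one increment per thread. */
        #pragma omp atomic
        outer++;
        #pragma omp parallel for num_threads(4)
        for (int i = 0; i < N; i++) {
            printf("parallel for: rank %d, i %d\n", my_rank, i);
            #pragma omp atomic
            counts[i]++;
        }
    }
    return outer;
}
```

After run_worksharing(counts), every entry of counts should be 1. After run_nested(counts), every entry should equal the returned outer team size, which matches the 16 lines seen in the question with four outer threads. The order of the printed lines is not fixed in either case.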