I attempted to write a Fortran program in which an internal subroutine is called inside a parallel do loop. Because the subroutine is not called anywhere except in this loop, and because the iteration variable i is global, I didn't see the need to pass it to the subroutine. Here's a simplified outline of the program which highlights the problem:
program test
integer :: i
i=37
$omp parallel do private(i)
do i=1,5
call do_work
enddo
$omp end parallel do
contains
subroutine do_work
print *,i
end subroutine do_work
end program test
I'm compiling this program using:
gfortran -O0 -fopenmp -o test test.f90
I compiled it using gfortran 4.4.6 on a machine with 8 cores, and using gfortran 5.4.0 on another machine with 8 cores, and got:
37
37
37
37
37
Of course, when compiled without the -fopenmp flag, I get the expected output:
1
2
3
4
5
So it seems that the pre-loop value of i is what do_work is seeing in every thread. Why does the subroutine not see its thread's local value for i? And why does passing i as an argument to the subroutine resolve the problem? I'm very new to OpenMP, so I apologize if the answer is obvious.
The OpenMP standard does not specify the behaviour of your program.
If you don't pass i
as an argument, and you want i
to be private to each thread both within the construct (the source that physically appears between the parallel and end parallel directives) and within the region (the source that is executed in between those directives, then you need to give i
the OpenMP threadprivate attribute.
Inside the procedure do_work, the variable i
is referenced by host association, and, inside the procedure, it does not appear lexically within the OpenMP construct - hence inside the procedure it is a variable that is referenced in a region but not in a construct.
Ordinarily 2.15.1.2 of OpenMP 4.5 specifies that reference to i
, in the procedure, would be shared.
But because i
is implicitly (because it is a do loop index) and explicitly private within the construct, 2.15.3.3 states that it is unspecified whether references to i
in the region but not in the construct are to the original (shared) item or the private copy.
When you pass i
as an argument "by reference", the dummy argument has the same data sharing attribute as the actual argument - i.e. if you pass i
to the procedure it becomes private.