I am trying to use the in-place MPI_Allreduce with the combination of MinGW-w64 gfortran (version 9.2, provided by MSYS2) and Microsoft MPI (version 10), but the following call does not give the expected result:
call MPI_Allreduce(MPI_IN_PLACE, srcdst, n, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, ierr)
The standard MPI_Allreduce (with distinct source and destination buffers) works well, as does the in-place variant when I use C instead of Fortran.
The complete test program, test_allreduce.f90, is:
program test_allreduce
  use iso_fortran_env, only: real64
  use mpi
  implicit none
  ! Integer kind used by the MPI module for handles and counts
  integer, parameter :: mpiint = kind(MPI_COMM_WORLD)
  integer(mpiint) :: n = 10
  integer(mpiint) :: ierr1 = -1, ierr2 = -1, ierr3 = -1, ierr4 = -1
  real(real64) :: src(10) = (/ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 /)
  real(real64) :: dst(10) = 0
  call MPI_Init(ierr1)
  ! Standard variant: distinct send and receive buffers
  call MPI_Allreduce(src, dst, n, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, ierr2)
  ! In-place variant: src is both input and output
  call MPI_Allreduce(MPI_IN_PLACE, src, n, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, ierr3)
  call MPI_Finalize(ierr4)
  ! Print the value of MPI_IN_PLACE, the error codes, and both buffers
  write (*, '(I4)') MPI_IN_PLACE
  write (*, '(4I4)') ierr1, ierr2, ierr3, ierr4
  write (*, '(10F4.0)') src
  write (*, '(10F4.0)') dst
end program
This is how I compile it:
set "PATH=C:\msys64\mingw64\bin;%PATH%"
x86_64-w64-mingw32-gfortran ^
-fno-range-check ^
"C:\Program Files (x86)\Microsoft SDKs\MPI\Include\mpi.f90" ^
test_allreduce.f90 ^
-I . ^
-I "C:\Program Files (x86)\Microsoft SDKs\MPI\Include\x64" ^
-o test_allreduce.exe ^
C:\Windows\System32\msmpi.dll
And this is how I execute it (in single process only so far):
test_allreduce.exe
Currently, it prints
0
0 0 0 0
0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
1. 2. 3. 4. 5. 6. 7. 8. 9. 10.
Apparently, the src buffer gets overwritten with garbage (all zeros here) in the second (in-place) call to MPI_Allreduce.
I noticed Intel-specific DLLIMPORT directives in the code of mpi.f90 and even tried adding the analogous
!GCC$ ATTRIBUTES DLLIMPORT :: MPI_IN_PLACE
without any effect.
It turns out that the problem is that in MSMPI the variable MPI_IN_PLACE is contained in an internal COMMON block /MPIPRIV1/, and it is a known bug in gfortran that the compiler fails to properly import COMMON block variables from DLLs.
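To illustrate why a wrongly imported COMMON variable matters: as far as I understand, Fortran MPI bindings commonly recognize MPI_IN_PLACE by the address of the variable that is passed, not by its value. The toy program below involves no MPI at all. The module fake_mpi, the block /fakepriv/ and every name in it are hypothetical stand-ins, and loc() is a GNU extension. It is only a sketch of the mechanism under that assumption, not MS-MPI's actual code.

```fortran
! Toy model of an address-based sentinel. No MPI is involved and all
! names (fake_mpi, /fakepriv/, fake_in_place) are hypothetical.
module fake_mpi
  implicit none
contains
  ! Stand-in for the library side: the sentinel is recognized purely by
  ! comparing the address of the actual argument with the address of the
  ! COMMON block member. loc() is a GNU extension.
  logical function is_in_place(buf)
    integer, intent(in) :: buf
    integer :: fake_bottom, fake_in_place
    common /fakepriv/ fake_bottom, fake_in_place
    is_in_place = (loc(buf) == loc(fake_in_place))
  end function is_in_place
end module fake_mpi

program sentinel_demo
  use fake_mpi
  implicit none
  integer :: fake_bottom, fake_in_place
  common /fakepriv/ fake_bottom, fake_in_place
  integer :: private_copy

  fake_bottom = 0
  fake_in_place = 0
  private_copy = fake_in_place   ! same value, different address

  ! The real COMMON member is passed by reference, so the addresses match.
  print *, 'common member recognized: ', is_in_place(fake_in_place)
  ! A variable with the same value but another address is treated as an
  ! ordinary buffer, which is roughly what a broken DLL import amounts to.
  print *, 'private copy recognized:  ', is_in_place(private_copy)
end program sentinel_demo
```

Compiled with plain gfortran, the first call should report true and the second false. If the caller's view of the COMMON block does not resolve to the same storage as the one inside the DLL, the library would see an ordinary send buffer instead of the sentinel.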
Nevertheless, broken things can be fixed. In the end, all that was needed was to apply a patch to the gfortran source, rebuild the compiler from scratch in MSYS2 (phew...), and add the directive
!GCC$ ATTRIBUTES DLLIMPORT :: MPI_BOTTOM, MPI_IN_PLACE
right after implicit none in the code presented above. (Both variables seem to be needed in the directive, because MPI_IN_PLACE is second in the internal COMMON block, just after MPI_BOTTOM.) With that, the in-place MPI_Allreduce works flawlessly.
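For anyone who cannot rebuild gfortran: since the standard MPI_Allreduce with distinct buffers works with the stock compiler (as noted in the question), a possible workaround is to copy the data into a temporary send buffer first. This is only a sketch under that assumption. The program name and the tmp array are made up for illustration, and it costs one extra copy of the data.

```fortran
! Workaround sketch: emulate the in-place reduction with a temporary
! send buffer, so that MPI_IN_PLACE is never passed to the library.
! The program name and the tmp array are hypothetical.
program test_allreduce_copy
  use iso_fortran_env, only: real64
  use mpi
  implicit none
  integer, parameter :: mpiint = kind(MPI_COMM_WORLD)
  integer(mpiint) :: n = 10
  integer(mpiint) :: ierr = -1
  real(real64) :: srcdst(10) = (/ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 /)
  real(real64) :: tmp(10)

  call MPI_Init(ierr)
  ! Copy the input aside, then reduce from the copy back into srcdst.
  tmp = srcdst
  call MPI_Allreduce(tmp, srcdst, n, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, ierr)
  call MPI_Finalize(ierr)
  write (*, '(10F4.0)') srcdst
end program test_allreduce_copy
```

It can be compiled in the same way as test_allreduce.f90 above. For large arrays the extra buffer may be a concern, in which case the patched compiler remains the cleaner fix.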