I want to use a derived datatype with MPI-3 shared memory. Given the following derived datatype:
type :: pve_data
   type(pve_data), pointer :: next     => NULL()
   real*8,         pointer :: array(:) => NULL()
end type pve_data
Initialized with:
type(pve_data) :: pve_grid
allocate(pve_grid%array(10))
pve_grid%array = 0.0d0
After having done some calculations with pve_grid, I want to create a shared memory window with MPI_WIN_ALLOCATE_SHARED whose base is pve_grid. (MPI_WIN_CREATE_DYNAMIC might be an alternative, but I need the performance of shared memory.)
1) Until now I have only used primitive datatypes, or arrays of them, as the base pointer for window creation. Can a derived datatype also be used as the base pointer, or do I need to create a window for every component of the derived datatype that is a primitive variable? (For reference, the primitive-type pattern I have working is sketched below the questions.)
2) Is it possible to use an already "used" variable (pve_grid in this case) as the base pointer, or do I need to use a new pve_data as the base pointer and copy the values from pve_grid into it?
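For reference, this is a minimal sketch of the pattern that already works for me with a primitive base: a plain integer array shared across one intracommunicator, no spawning. The element size of 4 bytes is an assumption about the default INTEGER, and window synchronization is kept minimal here for brevity:
PROGRAM sharedPrimitive
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE mpi
   IMPLICIT NONE
   INTEGER :: ierror, myRank, win, dispUnit
   INTEGER(KIND=MPI_ADDRESS_KIND) :: memSize
   TYPE(C_PTR) :: basePtr
   INTEGER, POINTER :: shared_array(:)

   CALL MPI_INIT(ierror)
   CALL MPI_COMM_RANK(MPI_COMM_WORLD, myRank, ierror)

   ! Only rank 0 contributes memory; the other ranks attach with size 0.
   memSize  = 0_MPI_ADDRESS_KIND
   IF (myRank == 0) memSize = 10 * 4   ! assumes 4-byte default INTEGER
   dispUnit = 4

   CALL MPI_WIN_ALLOCATE_SHARED(memSize, dispUnit, MPI_INFO_NULL, &
                                MPI_COMM_WORLD, basePtr, win, ierror)

   ! Every rank asks where rank 0's segment lives and maps it.
   CALL MPI_WIN_SHARED_QUERY(win, 0, memSize, dispUnit, basePtr, ierror)
   CALL C_F_POINTER(basePtr, shared_array, [10])

   IF (myRank == 0) shared_array = 0
   CALL MPI_BARRIER(MPI_COMM_WORLD, ierror)
   IF (myRank == 1) shared_array(1) = 42   ! visible to rank 0 as well
   CALL MPI_BARRIER(MPI_COMM_WORLD, ierror)
   IF (myRank == 0) WRITE(*,*) "shared_array(1) = ", shared_array(1)

   CALL MPI_WIN_FREE(win, ierror)
   CALL MPI_FINALIZE(ierror)
END PROGRAM sharedPrimitive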
EDIT I know that it would be easier to use an OpenMP approach instead of MPI shared memory, but I am deliberately trying an MPI-only approach to improve my MPI skills.
EDIT2 (05.09.16) I have made some progress and was able to use shared memory with a simple integer variable as the base pointer. But I still have a problem when using a derived datatype as the base pointer for the window creation (for testing purposes I changed its definition - see sharedStructMethod.f90 below). Neither compilation nor execution throws any error; the remote access simply has no effect on the derived-datatype components: the write shows the old values that were initialized by the parent. The code below shows my current state. I use the possibility to spawn new processes at execution time: the parent process creates the window and the child processes apply changes to it. I hope the spawning does not complicate debugging; I only added it for my project. (And next time I will change real*8 to conform to the standard.)
Declaration of the derived datatype (sharedStructMethod.f90):
module sharedStructMethod
   IMPLICIT NONE
   REAL*8,  PARAMETER :: prec = 1d-13
   INTEGER, PARAMETER :: masterProc = 0
   INTEGER, PARAMETER :: SLAVE_COUNT = 2
   INTEGER, PARAMETER :: CONSTSIZE = 10

   ! Struct definitions
   type :: vertex
      INTEGER, DIMENSION(3) :: coords
   end type vertex

   type :: pve_data
      real(kind(prec)), pointer :: intensity(:) => NULL()
      logical,          pointer :: flag         => NULL()
      type(vertex),     pointer :: vertices(:)  => NULL()
   end type pve_data
end module sharedStructMethod
The parent process (sharedStruct.f90), which the user executes:
PROGRAM sharedStruct
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE mpi
   USE sharedStructMethod
   IMPLICIT NONE
   TYPE(pve_data) :: pve_grid
   INTEGER :: ierror
   INTEGER :: myRank, numProcs
   INTEGER :: childComm
   INTEGER :: childIntracomm
   INTEGER :: i
   INTEGER(KIND=MPI_ADDRESS_KIND) :: memSize
   INTEGER :: dispUnit
   TYPE(C_PTR) :: basePtr
   INTEGER :: win
   TYPE(pve_data), POINTER :: shared_data

   CALL MPI_INIT(ierror)
   memSize  = sizeof(pve_grid)   ! SIZEOF is non-standard but supported by ifort
   dispUnit = 1

   ! Spawn the workers and merge the intercommunicator, so that parent
   ! and children share one intracommunicator for the window.
   CALL MPI_COMM_SPAWN("sharedStructWorker.x", MPI_ARGV_NULL, SLAVE_COUNT, &
                       MPI_INFO_NULL, masterProc, MPI_COMM_SELF, childComm, &
                       MPI_ERRCODES_IGNORE, ierror)
   CALL MPI_INTERCOMM_MERGE(childComm, .false., childIntracomm, ierror)

   ! The parent provides the storage for one pve_data.
   CALL MPI_WIN_ALLOCATE_SHARED(memSize, dispUnit, MPI_INFO_NULL, &
                                childIntracomm, basePtr, win, ierror)
   CALL C_F_POINTER(basePtr, shared_data)

   ! Initialize the components. NOTE: ALLOCATE obtains new heap memory
   ! for the pointer components; only the pve_data descriptor itself
   ! lies inside the shared window.
   CALL MPI_WIN_LOCK(MPI_LOCK_EXCLUSIVE, masterProc, 0, win, ierror)
   ALLOCATE(shared_data%intensity(CONSTSIZE))
   ALLOCATE(shared_data%vertices(CONSTSIZE))
   ALLOCATE(shared_data%flag)
   shared_data%intensity = -1.0d0
   DO i = 1, CONSTSIZE
      shared_data%vertices(i)%coords(1) = -1
      shared_data%vertices(i)%coords(2) = -2
      shared_data%vertices(i)%coords(3) = -3
   END DO
   shared_data%flag = .true.
   CALL MPI_WIN_UNLOCK(masterProc, win, ierror)

   ! First barrier: data initialized; second barrier: workers have written.
   CALL MPI_BARRIER(childIntracomm, ierror)
   CALL MPI_BARRIER(childIntracomm, ierror)

   WRITE(*,*) "After: Flag ", shared_data%flag, " intensity(1): ", shared_data%intensity(1)

   CALL MPI_FINALIZE(ierror)
END PROGRAM sharedStruct
And last but not least, the child process (sharedStructWorker.f90), which is spawned by the parent at runtime and modifies the window content:
PROGRAM sharedStructWorker
   USE mpi
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE sharedStructMethod
   IMPLICIT NONE
   INTEGER :: ierror
   INTEGER :: myRank, numProcs
   INTEGER :: parentComm
   INTEGER :: parentIntracomm
   TYPE(C_PTR) :: pveCPtr
   TYPE(pve_data), POINTER :: pve_gridPtr
   INTEGER :: win
   INTEGER(KIND=MPI_ADDRESS_KIND) :: sizeOfPve
   INTEGER :: dispUnit2

   CALL MPI_INIT(ierror)
   CALL MPI_COMM_GET_PARENT(parentComm, ierror)
   CALL MPI_INTERCOMM_MERGE(parentComm, .true., parentIntracomm, ierror)

   ! The children contribute no memory of their own (size 0) and then
   ! query the address of the parent's segment.
   sizeOfPve = 0_MPI_ADDRESS_KIND
   dispUnit2 = 1
   CALL MPI_WIN_ALLOCATE_SHARED(sizeOfPve, dispUnit2, MPI_INFO_NULL, &
                                parentIntracomm, pveCPtr, win, ierror)
   CALL MPI_WIN_SHARED_QUERY(win, masterProc, sizeOfPve, dispUnit2, pveCPtr, ierror)
   CALL C_F_POINTER(pveCPtr, pve_gridPtr)

   ! Wait until the parent has initialized the data, then modify it.
   CALL MPI_BARRIER(parentIntracomm, ierror)
   CALL MPI_WIN_LOCK(MPI_LOCK_EXCLUSIVE, masterProc, 0, win, ierror)
   pve_gridPtr%flag = .false.
   pve_gridPtr%intensity(1) = 42
   CALL MPI_WIN_UNLOCK(masterProc, win, ierror)
   CALL MPI_BARRIER(parentIntracomm, ierror)

   CALL MPI_FINALIZE(ierror)
END PROGRAM sharedStructWorker
Compilation with:
mpiifort -c sharedStructMethod.f90
mpiifort -o sharedStructWorker.x sharedStructWorker.f90 sharedStructMethod.o
mpiifort -o sharedStruct.x sharedStruct.f90 sharedStructMethod.o
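The parent is then started with a single process; the SLAVE_COUNT = 2 workers are spawned at runtime (the launcher name and flags may vary between MPI implementations):
mpiexec -n 1 ./sharedStruct.x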
Is this the right approach, or do I need to create a shared memory block with its own window for every pointer component of the derived datatype pve_data? Thank you for your help!
EDIT 10/09/2016, Solution: Explanation in the comments. One way to solve the problem is to create a separate window for every pointer component, on which parent and children then work. For more complex derived datatypes this quickly becomes tedious to implement, but it seems there is no other choice. A minimal sketch of that per-component approach follows.
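Parent side only; the children would pass a size of 0 and use MPI_WIN_SHARED_QUERY on each window, exactly as in the worker code above. The subroutine and the byte counts are my own assumptions (8-byte reals, 4-byte integers and logicals under ifort):
SUBROUTINE allocate_pve_windows(comm, intensity, vertices, flag, &
                                intensityWin, verticesWin, flagWin)
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE mpi
   USE sharedStructMethod
   IMPLICIT NONE
   INTEGER, INTENT(IN) :: comm
   REAL(kind(prec)), POINTER, INTENT(OUT) :: intensity(:)
   TYPE(vertex),     POINTER, INTENT(OUT) :: vertices(:)
   LOGICAL,          POINTER, INTENT(OUT) :: flag
   INTEGER, INTENT(OUT) :: intensityWin, verticesWin, flagWin
   INTEGER :: ierror
   INTEGER(KIND=MPI_ADDRESS_KIND) :: memSize
   TYPE(C_PTR) :: ptr

   ! One window per pointer component, each holding a flat array of a
   ! single type, so no derived-type layout issues can arise.
   memSize = CONSTSIZE * 8              ! assumes 8-byte reals
   CALL MPI_WIN_ALLOCATE_SHARED(memSize, 1, MPI_INFO_NULL, comm, ptr, intensityWin, ierror)
   CALL C_F_POINTER(ptr, intensity, [CONSTSIZE])

   memSize = CONSTSIZE * 3 * 4          ! 3 default integers per vertex
   CALL MPI_WIN_ALLOCATE_SHARED(memSize, 1, MPI_INFO_NULL, comm, ptr, verticesWin, ierror)
   CALL C_F_POINTER(ptr, vertices, [CONSTSIZE])

   memSize = 4                          ! one default logical
   CALL MPI_WIN_ALLOCATE_SHARED(memSize, 1, MPI_INFO_NULL, comm, ptr, flagWin, ierror)
   CALL C_F_POINTER(ptr, flag)
END SUBROUTINE allocate_pve_windows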
I think this is mostly covered here: MPI Fortran code: how to share data on node via openMP?
So, in terms of 1), you can have base pointers that are Fortran derived types. However, the answer to 2) is that MPI_Win_allocate_shared returns storage to you - you cannot reuse existing storage. Given you have a linked list, I don't see how this could be converted to a shared window even in principle. To make use of the returned storage it would be much simpler to have an array of pve_data objects - you are going to have to store them consecutively in the returned array anyway, so linking them doesn't seem to add anything useful.
I might have misunderstood here - if you only want the head of the list to be remotely accessible in the window then that should be OK.
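To illustrate the array idea, here is a rough sketch (my own module and type names, kinds and counts are assumptions): if pve_data is redefined as plain-old-data with fixed-size components, the storage returned by MPI_Win_allocate_shared maps directly onto a contiguous array of such objects:
MODULE podPve
   IMPLICIT NONE
   INTEGER, PARAMETER :: NELEMS = 4, CONSTSIZE = 10
   TYPE :: vertex
      INTEGER, DIMENSION(3) :: coords
   END TYPE vertex
   TYPE :: pve_data_pod                  ! no pointer components
      REAL(8)      :: intensity(CONSTSIZE)
      LOGICAL      :: flag
      TYPE(vertex) :: vertices(CONSTSIZE)
   END TYPE pve_data_pod
END MODULE podPve

PROGRAM sharedPod
   USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR, C_F_POINTER
   USE mpi
   USE podPve
   IMPLICIT NONE
   INTEGER :: ierror, myRank, win, dispUnit
   INTEGER(KIND=MPI_ADDRESS_KIND) :: memSize
   TYPE(C_PTR) :: basePtr
   TYPE(pve_data_pod), POINTER :: list(:)
   TYPE(pve_data_pod) :: dummy

   CALL MPI_INIT(ierror)
   CALL MPI_COMM_RANK(MPI_COMM_WORLD, myRank, ierror)

   ! Rank 0 provides room for NELEMS consecutive objects (size in bytes
   ! via the standard STORAGE_SIZE intrinsic, which returns bits).
   memSize = 0_MPI_ADDRESS_KIND
   IF (myRank == 0) memSize = NELEMS * (STORAGE_SIZE(dummy) / 8)
   dispUnit = 1
   CALL MPI_WIN_ALLOCATE_SHARED(memSize, dispUnit, MPI_INFO_NULL, &
                                MPI_COMM_WORLD, basePtr, win, ierror)
   CALL MPI_WIN_SHARED_QUERY(win, 0, memSize, dispUnit, basePtr, ierror)
   CALL C_F_POINTER(basePtr, list, [NELEMS])

   ! list(1) ... list(NELEMS) now live consecutively inside the shared
   ! segment; every rank on the node sees writes to any of them.
   IF (myRank == 0) list(1)%flag = .true.

   CALL MPI_BARRIER(MPI_COMM_WORLD, ierror)
   CALL MPI_WIN_FREE(win, ierror)
   CALL MPI_FINALIZE(ierror)
END PROGRAM sharedPod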