Tags: fortran, mpi, openmpi, derived-types

mpi_gather doesn't return entire vector with fortran derived datatype


I'm running into an issue where mpi_gather only returns a small subset of the vector that I'm trying to pass. Note that I'm running this with np = 1, but it also happens with np = 2 and np = 3. NAT = 3 (nat = number of atoms), and there are 194 unique pairs.

To set this up, I have two derived data types in Fortran (only the one used below is shown):

type dtlrdh_lut
    sequence
    integer p
    integer q
    integer ind
    real(dp), dimension(3, 3) :: TLR
    real(dp), dimension(3, 3, 3, 3) :: dTLRdh
end type dtlrdh_lut

In my subroutine, I have defined my variables like so:

    type(dtlrdh_lut), dimension(:), allocatable :: my_dtlrdh, collected_dtlrdh
    integer :: dh_dtype, dr_dtype, dh_types(5), dr_types(6), dh_blocks(5), dr_blocks(6)
    INTEGER(kind=MPI_ADDRESS_KIND) :: dh_offsets(5), dr_offsets(6)
    integer :: numtasks, rank, ierr, dh_displs(nproc_image), dr_displs(nproc_image)
    integer :: n, status(mpi_status_size)

I split up the work between processes in another method, then count how many elements of the lookup table need to be computed on this process and allocate the local lookup table like so:

    my_num_pairs = 0
    do i = 1, num_pairs, 1
        if(unique_pairs(i)%cpu.eq.me_image) then
            my_num_pairs = my_num_pairs + 1
        end if
    end do
    if(.not.allocated(my_dtlrdh)) allocate(my_dtlrdh(my_num_pairs))

I also allocate and zero out the lookup table that everything will be combined back into with the following code:

    if(me_image.eq.root_image) then
        if(.not.allocated(collected_dtlrdh)) allocate(collected_dtlrdh(num_pairs))

        do i=1,my_num_pairs,1
            collected_dtlrdh(i)%p = 0
            collected_dtlrdh(i)%q = 0
            collected_dtlrdh(i)%ind = 0
            collected_dtlrdh(i)%TLR = 0.0_DP
            collected_dtlrdh(i)%dTLRdh = 0.0_DP
        end do
    end if

I then fill in the lookup table; I'll skip that code since it's long and not relevant here. With that done, it's time to gather everything back onto the root process with MPI.

    call mpi_get_address(my_dtlrdh(1)%p,               dh_offsets(1), ierr)
    call mpi_get_address(my_dtlrdh(1)%q,               dh_offsets(2), ierr)
    call mpi_get_address(my_dtlrdh(1)%ind,             dh_offsets(3), ierr)
    call mpi_get_address(my_dtlrdh(1)%TLR(1,1),        dh_offsets(4), ierr)
    call mpi_get_address(my_dtlrdh(1)%dTLRdh(1,1,1,1), dh_offsets(5), ierr)
    do i = 2, size(dh_offsets)
      dh_offsets(i) = dh_offsets(i) - dh_offsets(1)
    end do
    dh_offsets(1) = 0
    dh_types = (/MPI_INTEGER, MPI_INTEGER, MPI_INTEGER, MPI_DOUBLE_PRECISION, MPI_DOUBLE_PRECISION/)
    dh_blocks = (/1, 1, 1, 3*3, 3*3*3*3/)
    call mpi_type_struct(5, dh_blocks, dh_offsets, dh_types, dh_dtype, ierr)
    call mpi_type_commit(dh_dtype, ierr)

I then call gather via:

    call mpi_gather(my_dtlrdh, my_num_pairs, dh_dtype, &
                     collected_dtlrdh, my_num_pairs, dh_dtype, &
                     root_image, intra_image_comm, ierr)

After the gather, I print the local and collected values side by side:

    do i = 1, num_pairs, 1
        write(stdout, *) my_dtlrdh(i)%p, collected_dtlrdh(i)%p, my_dtlrdh(i)%q, collected_dtlrdh(i)%q
    end do

and this is where I see really strange information. The first few elements that are printed out look fine:

       1           1           3           3
       1           1           6           6
       1           1           9           9

But the tail end of my vector looks as though I only sent 174 elements instead of the full 194:

      17           0          24           0
      18           0          19           0
      18           0          20           0
      18           0          21           0
      18           0          22           0

Given that there are 194 pairs, and both num_pairs and my_num_pairs equal 194, I'm confused. In desperation I started playing around, and found that if I use this gather call instead of the one above, I get the full vector:

    num_pairs = 2*num_pairs+40
    call mpi_gather(my_dtlrdh, num_pairs, dh_dtype, &
                     collected_dtlrdh, num_pairs, dh_dtype, &
                     root_image, intra_image_comm, ierr)

which I found purely by guess and check. While this may work for this small test system, it hardly seems like a scalable solution. I'm totally at a loss... any ideas? And please, let me know if you need any more information from me.


Solution

  • MPI_TYPE_STRUCT is deprecated in favour of MPI_TYPE_CREATE_STRUCT. The latter takes conceptually the same arguments as the former but the array of offsets is of type INTEGER(KIND=MPI_ADDRESS_KIND), i.e. the type returned by MPI_GET_ADDRESS.
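
    For reference, here is a minimal sketch of how the type construction from the question could look with MPI_TYPE_CREATE_STRUCT. It reuses the variable and field names from the question and assumes the MPI module (or mpif.h header) is already in scope, as in the original code:

        ! A sketch of the same type map built with MPI_TYPE_CREATE_STRUCT,
        ! whose displacement argument is INTEGER(KIND=MPI_ADDRESS_KIND),
        ! matching what MPI_GET_ADDRESS returns.
        integer :: dh_dtype, dh_blocks(5), dh_types(5), ierr, i
        integer(kind=MPI_ADDRESS_KIND) :: dh_offsets(5), base

        call mpi_get_address(my_dtlrdh(1)%p,               dh_offsets(1), ierr)
        call mpi_get_address(my_dtlrdh(1)%q,               dh_offsets(2), ierr)
        call mpi_get_address(my_dtlrdh(1)%ind,             dh_offsets(3), ierr)
        call mpi_get_address(my_dtlrdh(1)%TLR(1,1),        dh_offsets(4), ierr)
        call mpi_get_address(my_dtlrdh(1)%dTLRdh(1,1,1,1), dh_offsets(5), ierr)

        ! Make all displacements relative to the first member.
        base = dh_offsets(1)
        do i = 1, size(dh_offsets)
            dh_offsets(i) = dh_offsets(i) - base
        end do

        dh_types  = (/MPI_INTEGER, MPI_INTEGER, MPI_INTEGER, &
                      MPI_DOUBLE_PRECISION, MPI_DOUBLE_PRECISION/)
        dh_blocks = (/1, 1, 1, 3*3, 3*3*3*3/)

        call mpi_type_create_struct(5, dh_blocks, dh_offsets, dh_types, dh_dtype, ierr)
        call mpi_type_commit(dh_dtype, ierr)

    Note that dh_offsets is already declared as INTEGER(KIND=MPI_ADDRESS_KIND) in the question, so only the constructor call itself needs to change.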