Tags: fortran, mpi, binaryfiles, mpi-io

Incorrect results when reading binary file with MPI I/O


I am new to MPI and am struggling to read a binary file. Specifically, I have a $198 \times 50 \times 50$ array of 16-bit integers stored in a binary file. I want to use 2 compute nodes to process this file, so there are two MPI processes and each one handles half of the input. I am using MPI_FILE_READ_AT to read each process's region. I expect the array values to fill the 'bucket' argument that I pass to the call, but a sanity-check printout of the 'bucket' entries shows that the values are all incorrect. I suspect I am getting the arguments wrong.

program main
use mpi
implicit none

integer :: i, error, num_processes, id, fh
integer(MPI_OFFSET_KIND) :: filesize, offset
integer(MPI_OFFSET_KIND) :: num_bytes_per_process
integer(MPI_OFFSET_KIND) :: num_bytes_this_process
integer ::   num_ints_per_process, num_ints_this_process
integer(kind = 2), dimension(:), allocatable  :: bucket
character(len=100) :: inputFileName
integer, parameter :: INTKIND=2

! Initialize
inputFileName =  'xyz_50x50'
print *, 'MPI_OFFSET_KIND =', MPI_OFFSET_KIND

! MPI basics
call MPI_Init ( error )
call MPI_Comm_size ( MPI_COMM_WORLD, num_processes, error )
call MPI_Comm_rank ( MPI_COMM_WORLD, id, error )

! Open the file
call MPI_FILE_OPEN(MPI_COMM_WORLD, inputFileName, MPI_MODE_RDONLY, &
           MPI_INFO_NULL, fh, error)

! get the size of the file
call MPI_File_get_size(fh, filesize, error)

! Note: filesize is the TOTAL number of bytes in the file
num_bytes_per_process = filesize/num_processes
num_ints_per_process = num_bytes_per_process/INTKIND 
offset = id * num_bytes_per_process

num_bytes_this_process = min(num_bytes_per_process, filesize - offset)
num_ints_this_process = num_bytes_this_process/INTKIND

allocate(bucket(num_ints_this_process))
call MPI_FILE_READ_AT(fh, offset, bucket, num_ints_this_process, &
              MPI_SHORT, MPI_STATUS_SIZE, error)

do i = 1, num_ints_this_process
    if (bucket(i) /= 0) then
       print *, "my id is ", id, " and bucket(",i,")=", bucket(i)
    endif
enddo

! close the file
call MPI_File_close(fh, error)

! close mpi 
call MPI_Finalize(error)

end program main

Solution

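  • You have to use MPI_STATUS_IGNORE instead of MPI_STATUS_SIZE (FWIW, I am unable to compile this program unless I fix this). MPI_STATUS_SIZE is only the integer length of a status array, not a status object that the call can write into.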

    call MPI_FILE_READ_AT(fh, offset, bucket, num_ints_this_process, &
                  MPI_SHORT, MPI_STATUS_IGNORE, error)
    

    Note that since all MPI tasks read the file at the same time, you would be better off using the collective MPI_File_read_at_all() subroutine in order to improve performance.
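For illustration, the two points above can be combined into one program. The following is a minimal sketch, not a tested drop-in replacement. It makes several assumptions that are not in the original post: it passes a real status array (declared with MPI_STATUS_SIZE elements) instead of MPI_STATUS_IGNORE so that MPI_Get_count can report how many elements were read, it uses MPI_INTEGER2 rather than MPI_SHORT to match integer(kind=2) (MPI_INTEGER2 is an optional datatype, so it may be missing in some MPI builds), and it splits the file by element count so that every offset stays on a 2-byte boundary even when the size does not divide evenly. The program name and the helper variable names are hypothetical.

```fortran
program read_collective
   use mpi
   implicit none

   ! Assumption: the file holds raw 2-byte integers, as in the question.
   integer, parameter :: bytes_per_int = 2
   character(len=*), parameter :: input_file = 'xyz_50x50'

   integer :: ierr, nprocs, rank, fh, n_read
   integer :: status(MPI_STATUS_SIZE)   ! a status array, sized by MPI_STATUS_SIZE
   integer(MPI_OFFSET_KIND) :: filesize, offset
   integer(MPI_OFFSET_KIND) :: np, r, n_total, n_base, n_extra, n_mine, first
   integer(kind=2), allocatable :: bucket(:)

   call MPI_Init(ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

   call MPI_File_open(MPI_COMM_WORLD, input_file, MPI_MODE_RDONLY, &
                      MPI_INFO_NULL, fh, ierr)
   ! File routines return errors by default instead of aborting, so check.
   if (ierr /= MPI_SUCCESS) then
      print *, 'rank', rank, ': could not open ', input_file
      call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
   end if

   call MPI_File_get_size(fh, filesize, ierr)

   ! Split by element count; the first n_extra ranks take one extra element.
   np      = int(nprocs, MPI_OFFSET_KIND)
   r       = int(rank, MPI_OFFSET_KIND)
   n_total = filesize / bytes_per_int
   n_base  = n_total / np
   n_extra = mod(n_total, np)

   n_mine = n_base
   if (r < n_extra) n_mine = n_mine + 1
   first  = r * n_base + min(r, n_extra)   ! index of this rank's first element
   offset = first * bytes_per_int          ! byte offset into the file

   allocate(bucket(n_mine))

   ! Collective read: every rank in the communicator must make this call.
   call MPI_File_read_at_all(fh, offset, bucket, int(n_mine), &
                             MPI_INTEGER2, status, ierr)

   ! Ask the status how many elements actually arrived.
   call MPI_Get_count(status, MPI_INTEGER2, n_read, ierr)
   print *, 'rank', rank, 'read', n_read, 'of', n_mine, 'elements'

   call MPI_File_close(fh, ierr)
   deallocate(bucket)
   call MPI_Finalize(ierr)
end program read_collective
```

If the element count is not needed, MPI_STATUS_IGNORE can be passed in place of the status array, as in the corrected call above. With a typical MPI installation this would be built with the compiler wrapper (for example mpif90) and launched with two processes through mpirun or mpiexec; the exact commands depend on the installation.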