
MPI_FILE_WRITE_ORDERED overwrites previously written data


I have the following code:

program mpi_io

use mpi

implicit none

integer :: mpierr, whoami, nproc, iout, STATUS(MPI_STATUS_SIZE),charsize
integer(kind=mpi_offset_kind):: OFFSET, fs
character(len=60) :: dd,de
character:: newline = NEW_LINE('FORTRAN')


call MPI_INIT        ( mpierr )
call MPI_COMM_RANK   ( MPI_COMM_WORLD, whoami, mpierr )
call MPI_COMM_SIZE   ( MPI_COMM_WORLD, nproc, mpierr )


dd ='=========================' //  INT2STR(whoami)//newline
de = 'special'//  INT2STR(whoami)//newline
call MPI_FILE_OPEN(  MPI_COMM_WORLD, 'test.dat',  MPI_MODE_CREATE + MPI_MODE_WRONLY, MPI_INFO_NULL, IOUT, mpierr)
call mpi_type_size(mpi_byte, charsize , mpierr)


offset = charsize*len(TRIM(de))
if(whoami == 0)call MPI_FILE_WRITE_AT( iout,offset, TRIM(de), len(TRIM(de)), MPI_BYTE, status, mpierr)

call MPI_File_get_size(iout, fs, mpierr)
offset = fs
call MPI_FILE_SEEK(iout, fs, MPI_SEEK_SET, mpierr)
call MPI_FILE_WRITE_ordered( iout,  TRIM(dd), len(TRIM(dd)), MPI_BYTE, status, mpierr)

!call MPI_FILE_WRITE_ordered( iout,  dd, len(dd), MPI_CHARACTER, status, mpierr)

call mpi_file_close(iout,mpierr)

call mpi_finalize(mpierr)

contains

   function INT2STR( i ) result( str )

   integer, intent(in)       :: i
   character(:), allocatable :: str
   character(RANGE(i)+2)     :: tmp

   write(tmp, '(I0)') i
   str = TRIM(tmp)

   end function
end program

What I am looking for is a way to combine writes to a file by a single process with writes by all processes. As you can see in this example, I first want to write de from the root rank only, and then write dd from all processes.

Right now, it seems that my de is being overwritten.

As you can see, I tried to offset the collective write by querying the size of the file and calling MPI_FILE_SEEK, but it does not seem to help. Does anybody have an idea?

I am using ifort v19.


Solution

  • To quote the MPI 3.1 standard, section 13.4.1, under "Positioning": "MPI provides three types of positioning for data access routines: explicit offsets, individual file pointers, and shared file pointers. The different positioning methods may be mixed within the same program and do not affect each other."

    Your problem is that you are mixing all three positioning methods. MPI_FILE_WRITE_AT uses an explicit offset, MPI_FILE_SEEK changes the individual file pointer, and MPI_FILE_WRITE_ORDERED writes at the current position of the shared file pointer. Because "the different positioning methods may be mixed within the same program and do not affect each other", nothing you pass to MPI_FILE_WRITE_AT or MPI_FILE_SEEK can affect where MPI_FILE_WRITE_ORDERED puts data in the file. The shared file pointer is still at the start of the file when MPI_FILE_WRITE_ORDERED is first called, so that call overwrites the data written by MPI_FILE_WRITE_AT.
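
    The independence of the individual and shared file pointers can be checked directly. The following is a minimal sketch, not part of the original program: the program name, the scratch file name pointer_demo.dat and the variable names are made up for illustration. It moves the individual file pointer with MPI_FILE_SEEK and then queries both pointers with MPI_FILE_GET_POSITION and MPI_FILE_GET_POSITION_SHARED.

    ```fortran
    program demo_file_pointers

      ! Illustrative sketch: MPI_FILE_SEEK moves only the individual file
      ! pointer; the shared file pointer used by MPI_FILE_WRITE_ORDERED
      ! is left where it was.
      use mpi

      implicit none

      integer :: ierr, rank, fh
      integer(kind=MPI_OFFSET_KIND) :: seek_to, pos_individual, pos_shared

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

      ! 'pointer_demo.dat' is a hypothetical scratch file, deleted on close.
      call MPI_FILE_OPEN(MPI_COMM_WORLD, 'pointer_demo.dat', &
                         MPI_MODE_CREATE + MPI_MODE_RDWR + MPI_MODE_DELETE_ON_CLOSE, &
                         MPI_INFO_NULL, fh, ierr)

      ! Move this rank's individual file pointer to byte 9.
      seek_to = 9
      call MPI_FILE_SEEK(fh, seek_to, MPI_SEEK_SET, ierr)

      ! Query both pointers; with the default file view both are in bytes.
      call MPI_FILE_GET_POSITION(fh, pos_individual, ierr)
      call MPI_FILE_GET_POSITION_SHARED(fh, pos_shared, ierr)

      if (rank == 0) then
        print '(A,I0)', 'individual pointer: ', pos_individual
        print '(A,I0)', 'shared pointer:     ', pos_shared
      end if

      call MPI_FILE_CLOSE(fh, ierr)
      call MPI_FINALIZE(ierr)

    end program demo_file_pointers
    ```

    With the default file view I would expect this to report 9 for the individual pointer and 0 for the shared pointer, which is why the seek in your program has no effect on the ordered write.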

    What you want is for the write of de to update the shared file pointer. Furthermore, since only one process performs that write, you do NOT want a collective routine. The routine that does this is MPI_FILE_WRITE_SHARED. Here is a version of your program that I believe does what you want:

    ijb@ianbushdesktop ~/work/stack $ cat mpiio.f90
    program test_mpi_io
    
      use mpi
    
      implicit none
    
      integer :: mpierr, whoami, nproc, iout, STATUS(MPI_STATUS_SIZE),charsize
      character(len=60) :: dd,de
      character:: newline = NEW_LINE('FORTRAN')
    
    
      call MPI_INIT        ( mpierr )
      call MPI_COMM_RANK   ( MPI_COMM_WORLD, whoami, mpierr )
      call MPI_COMM_SIZE   ( MPI_COMM_WORLD, nproc, mpierr )
    
    
      dd ='=========================' //  INT2STR(whoami)//newline
      de = 'special'//  INT2STR(whoami)//newline
      call MPI_FILE_OPEN(  MPI_COMM_WORLD, 'test.dat',  MPI_MODE_CREATE + MPI_MODE_WRONLY, MPI_INFO_NULL, IOUT, mpierr)
      call mpi_type_size(mpi_byte, charsize , mpierr)
    
    
      if(whoami == 0)call MPI_FILE_WRITE_SHARED( iout,TRIM(de), len(TRIM(de)), MPI_BYTE, status, mpierr)
    
      call MPI_FILE_WRITE_ordered( iout,  TRIM(dd), len(TRIM(dd)), MPI_BYTE, status, mpierr)
    
      call mpi_file_close(iout,mpierr)
    
      call mpi_finalize(mpierr)
    
    contains
    
      function INT2STR( i ) result( str )
    
        integer, intent(in)       :: i
        character(:), allocatable :: str
        character(RANGE(i)+2)     :: tmp
    
        write(tmp, '(I0)') i
        str = TRIM(tmp)
    
      end function INT2STR
    end program test_mpi_io
    
    ijb@ianbushdesktop ~/work/stack $ mpif90 -Wall -Wextra -std=f2003 -O mpiio.f90  -o test_mpi_io
    ijb@ianbushdesktop ~/work/stack $ rm test.dat 
    ijb@ianbushdesktop ~/work/stack $ mpirun -np 4 ./test_mpi_io 
    ijb@ianbushdesktop ~/work/stack $ cat test.dat 
    special0
    =========================0
    =========================1
    =========================2
    =========================3
    ijb@ianbushdesktop ~/work/stack $ rm test.dat 
    ijb@ianbushdesktop ~/work/stack $ mpirun -np 8 ./test_mpi_io 
    ijb@ianbushdesktop ~/work/stack $ cat test.dat 
    special0
    =========================0
    =========================1
    =========================2
    =========================3
    =========================4
    =========================5
    =========================6
    =========================7
    ijb@ianbushdesktop ~/work/stack $ 
    

    Also, while I am here: you should avoid giving anything in an MPI program a name that begins with mpi_. That prefix is reserved by MPI, and using it risks a name clash. Hence my renaming of your program unit.