I'm trying to pass data through argv to processes that are spawned dynamically with MPI. With mpicc (gcc) this works fine, but with Intel's mpiicc it only works if I set the last argument to NULL, like this:
for (i = argc; i < 5; i++)
    argv[i] = malloc(sizeof(char) * 10);
for (i = 0; i < numProc; i++) {
    sprintf(argv[2], "%d", vetIni[i+1]);
    sprintf(argv[3], "%d", vetEnd[i+1]);
    argv[4] = NULL;
    MPI_Comm_spawn(bin, argv, 1, localInfo, 0, MPI_COMM_SELF, &interCommFather[i], err);
    MPI_Send(&Q[0], N*N, MPI_FLOAT, 0, 99, interCommFather[i]);
}
Also, if I print more argv positions, I see dozens of extra entries after the NULL. Is this supposed to happen?
arg 0 -> ./root
arg 1 -> 3
arg 2 -> 96
arg 3 -> 128
arg 4 -> (null)
arg 5 -> MKLROOT=/opt/intel/compilers_and_libraries_2017.1.132/linux/mkl
arg 6 -> LC_PAPER=pt_BR.UTF-8
arg 7 -> MANPATH=/opt/intel/man/common:/opt/intel/documentation_2017/en/debugger//gdb-ia/man/:/opt/intel/documentation_2017/en/debugger//gdb-mic/man/:/opt/intel/documentation_2017/en/debugger//gdb-igfx/man/:/usr/local/man:/usr/local/share/man:/usr/share/man:
arg 8 -> XDG_SESSION_ID=198125
arg 9 -> LC_ADDRESS=pt_BR.UTF-8
arg 10 -> LC_MONETARY=pt_BR.UTF-8
arg 11 -> INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2017.1.132/linux/licenses:/opt/intel/licenses:/home/adriano/intel/licenses
arg 12 -> IPPROOT=/opt/intel/compilers_and_libraries_2017.1.132/linux/ipp
arg 13 -> TERM=xterm-256color
arg 14 -> SHELL=/bin/bash
arg 15 -> GDBSERVER_MIC=/opt/intel/debugger_2017/gdb/targets/mic/bin/gdbserver
[...]
This solved my problem but it's probably not the right way to solve it. Would anyone know the correct way to solve this situation? If I don't set the last argument to null, I get the following error:
[mpiexec@hype2] fn_spawn (../../pm/pmiserv/pmiserv_pmi_v1.c:893): unable to find token: arg22
[mpiexec@hype2] handle_pmi_cmd (../../pm/pmiserv/pmiserv_cb.c:69): PMI handler returned error
[mpiexec@hype2] control_cb (../../pm/pmiserv/pmiserv_cb.c:957): unable to process PMI command
[mpiexec@hype2] HYDT_dmxu_poll_wait_for_event (../../tools/demux/demux_poll.c:76): callback returned error status
[mpiexec@hype2] HYD_pmci_wait_for_completion (../../pm/pmiserv/pmiserv_pmci.c:501): error waiting for event
[mpiexec@hype2] main (../../ui/mpich/mpiexec.c:1147): process manager error waiting for completion
Thank You.
From the documentation for MPI_Comm_spawn:
In C, the MPI_Comm_spawn argument argv differs from the argv argument of main in two respects. First, it is shifted by one element. Specifically, argv[0] of main contains the name of the program (given by command). argv[1] of main corresponds to argv[0] in MPI_Comm_spawn, argv[2] of main to argv[1] of MPI_Comm_spawn, and so on. Second, argv of MPI_Comm_spawn must be null-terminated, so that its length can be determined.
(The last sentence of the quote is the relevant requirement here.)
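Without the NULL terminator, MPI_Comm_spawn cannot tell where the array ends and keeps reading past it, which is undefined behavior. Setting the last element to NULL is therefore the correct fix, not a workaround.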
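As a rough sketch of the two points in the quoted documentation (the shift by one and the NULL terminator), the arguments can be put in a separate array instead of being written into main's argv, which also avoids storing pointers past argv[argc]. The names build_spawn_argv, SPAWN_ARG_LEN, bufs and spawn_argv are hypothetical, and the sketch assumes the child only needs the two range values from the question:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical buffer size; large enough for any int printed with "%d". */
#define SPAWN_ARG_LEN 16

/* Hypothetical helper (not from the original post): fills a dedicated,
 * NULL-terminated argument vector for MPI_Comm_spawn. */
static void build_spawn_argv(int ini, int end,
                             char bufs[2][SPAWN_ARG_LEN],
                             char *spawn_argv[3])
{
    snprintf(bufs[0], SPAWN_ARG_LEN, "%d", ini);
    snprintf(bufs[1], SPAWN_ARG_LEN, "%d", end);
    spawn_argv[0] = bufs[0];  /* the child sees this as its argv[1] */
    spawn_argv[1] = bufs[1];  /* the child sees this as its argv[2] */
    spawn_argv[2] = NULL;     /* terminator required by MPI_Comm_spawn */
}

/* Intended use inside the spawn loop from the question (needs <mpi.h>):
 *
 *   char bufs[2][SPAWN_ARG_LEN];
 *   char *spawn_argv[3];
 *   build_spawn_argv(vetIni[i+1], vetEnd[i+1], bufs, spawn_argv);
 *   MPI_Comm_spawn(bin, spawn_argv, 1, localInfo, 0, MPI_COMM_SELF,
 *                  &interCommFather[i], err);
 */
```

Because of the shift, the program name is not part of this array; the child's argv[0] comes from the command argument (bin in the question).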
Reading entries after the NULL terminator means reading past the end of the array, so it is also undefined behavior. What you are seeing looks like the list of environment variables, which on many systems happens to sit right after argv in memory, but this is not guaranteed at all.
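To stay inside the array when printing, a loop can stop at the NULL terminator (or at argc in main) rather than at a fixed count. A minimal sketch, where count_args and print_args are hypothetical helper names and the output format copies the one in the question:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: counts entries up to, but not including, the
 * NULL terminator. For main's argv this equals argc. */
static int count_args(char *const argv[])
{
    int n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}

/* Prints only the valid entries; never touches anything past the NULL. */
static void print_args(char *const argv[])
{
    int n = count_args(argv);
    for (int i = 0; i < n; i++)
        printf("arg %d -> %s\n", i, argv[i]);
}
```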