mpispawn

MPI_Comm_spawn from a process other than rank 0


I'm trying to do the following:

1-Run the code with mpirun -np 2 xxx
2-Rank 1 process spawns 2 slaves (spawn_example code found online)

When I tried to spawn from rank 0 it worked, but from rank 1 it hangs and keeps waiting until I stop the execution with Ctrl+C.

Here's the code. If you run it with -np 1 it finishes normally, but with -np 2 it hangs:

#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int tag = 1;
    int tag1 = 2;
    int tag2 = 3;
    int my_rank;
    int num_proc;

    int array_of_errcodes[10];

    int i;

    MPI_Status status;
    MPI_Comm inter_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &num_proc);

    if (my_rank == 0)
    {
        printf("I'm process rank %d \n ", my_rank);
    }

    if (my_rank == 1)
    {
        printf("Rank %d process is spawning 2 slaves \n", my_rank);

        MPI_Comm_spawn("spawn_example", MPI_ARGV_NULL, 2, MPI_INFO_NULL, 1, MPI_COMM_WORLD, &inter_comm, array_of_errcodes);
    }

    MPI_Finalize();
    exit(0);
}

I don't know what I'm doing wrong. I'd like to know how to make it possible for other ranks to spawn their own slaves and eventually exchange data with them.
Thanks.
Edit1: I added the full code. If you need spawn_example, I can provide a link to it.


Solution

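  • This MPI call doesn't do what you think. MPI_Comm_spawn is a collective call over the communicator you pass to it: every process in that communicator must make the call, and together they create one child MPI job with maxprocs processes (2 in your code). Since you pass MPI_COMM_WORLD, you need to call it from all processes (remove the if).

    And to answer your question, why does it work from the root and not from the others? Because of the implementation. Only the root process performs the task of spawning the processes (the command, argv, maxprocs and info arguments are significant only at the root), so a call made from the root alone may appear to work. The other ranks in the communicator are still required to take part in the call, and a non-root rank calling it alone waits forever for the rest.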