How to make a child process wait for its grandchild processes?


I'm trying to make a process create x child processes, with each of those child processes creating y child processes of its own. It looks something like this when x=2, y=2:

[image: process tree for x=2, y=2]

This is what I have done so far,

    pid_t managers[x];
    pid_t workers[y];

    for (int i = 0; i < x; i++) {
        if (i == 0)  // create first manager process
            managers[i] = fork();
        else if (managers[i - 1] > 0)
            managers[i] = fork();
        if (managers[i] == 0)
            for (int j = 0; j < y; j++) {
                if (j == 0)  // create first worker process
                    workers[j] = fork();
                else if (workers[j - 1] > 0)
                    workers[j] = fork();
            }
        // manager process waiting for worker process?
        for (int j = 0; j < y; j++)
            wait(NULL);
    }
    // director process waiting for manager process?
    for (int i = 0; i < x; i++)
        wait(NULL);

    for (int i = 0; i < x; i++)
        printf("managers[%d] = %d\n", i, managers[i]);


When I run this code with x=2, y=2, the final printf loop prints this:

managers[0] = 0
managers[1] = 0
managers[0] = 0
managers[0] = 0
managers[1] = 0
managers[1] = 0
managers[0] = 0
managers[1] = 0
managers[0] = 0
managers[1] = 0
managers[0] = 0
managers[1] = 0
managers[0] = 0
managers[1] = 0
managers[0] = 0
managers[1] = 0
managers[0] = 0
managers[1] = 0
managers[0] = 22728
managers[1] = 0
managers[0] = 22728
managers[1] = 0
managers[0] = 22728
managers[1] = 0
managers[0] = 22728
managers[1] = 22737

I don't understand the behavior of my code, but my guess is that I'm using the wait function wrong. I used wait() the way I would use pthread_join() for multiple pthreads. Is this the wrong approach? How do you make a manager process wait for its child worker processes, and the director process wait for the manager processes?


Solution

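  • The way you are using fork() is wrong. A call to fork() splits the process inside the kernel: a child with a new PID is created, the parent receives the child's PID as the return value, and the child receives 0 (on failure, no child is created and the caller receives -1). In your loops the children never leave the loop that created them, so they keep executing code that was meant for their parent.

    Here is an example of how to use fork() properly: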

    pid_t pid = fork();
    if (pid < 0) {
        // fork failed, no child was created
    } else if (pid == 0) {
        // I am the child process
    } else {
        // I am the parent; pid holds the child's PID
    }
    

    So in your case, you need to exit the loop if you are the child (and start executing the child code), and, if you are the parent, you need to store the PID and continue the loop.
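The loop structure described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a drop-in replacement for the original program: `run_director` and `run_manager` are hypothetical names, the workers do no real work, and each manager reports how many workers it reaped through its exit status (which assumes y < 255).

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Manager body (hypothetical helper): fork y workers, then wait for them.
 * Returns the number of workers that were reaped. */
static int run_manager(int y) {
    for (int j = 0; j < y; j++) {
        pid_t pid = fork();
        if (pid < 0)
            _exit(255);            /* fork failed: report an error status */
        if (pid == 0)
            _exit(0);              /* worker: do its job here, then leave */
        /* manager: keep looping to create the next worker */
    }
    int reaped = 0;
    while (wait(NULL) > 0)         /* a manager waits only for its own workers */
        reaped++;
    return reaped;
}

/* Director body (hypothetical helper): fork x managers, store their PIDs,
 * then wait for each one. Returns how many managers reaped all y workers. */
int run_director(int x, int y) {
    assert(x > 0 && y > 0 && y < 255);
    pid_t managers[x];
    int created = 0;
    for (int i = 0; i < x; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                 /* stop creating managers on failure */
        if (pid == 0)
            _exit(run_manager(y)); /* manager never returns into this loop */
        managers[created++] = pid; /* director: store the PID and continue */
    }
    int ok = 0;
    for (int i = 0; i < created; i++) {
        int status;
        if (waitpid(managers[i], &status, 0) == managers[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == y)
            ok++;
    }
    return ok;
}
```

Calling `run_director(2, 2)` from `main` creates 2 managers and 4 workers. Every child ends with `_exit()` instead of falling through, so no child ever re-enters the loop that belongs to its parent.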

    By default, only the parent can wait for a child with wait() or waitpid(). If a child exits and its parent has not yet called wait() for it, the child remains a zombie until the parent does. If the parent itself exits first, the kernel reassigns its children to PID 1 (usually init(1)), which waits for them and so cleans up the zombies. Collecting a child's exit status this way is known as reaping.
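As a small illustration of reaping and of the rule that a process can only wait for its own children, consider the sketch below. The helper names `reap_exit_code` and `wait_for_non_child` are made up for this example, and the second helper simply uses the caller's own PID as a convenient example of a process that is not its child.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, reap it, and return the exit code
 * reported by waitpid(), or -1 if something went wrong. */
int reap_exit_code(int code) {
    assert(code >= 0 && code <= 255);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);               /* child: terminate immediately */
    int status;
    /* Once the child has exited it stays a zombie until this call reaps it. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Try to wait for a process that is not our child (here: ourselves).
 * Returns the errno value set by waitpid(), or 0 if it did not fail. */
int wait_for_non_child(void) {
    errno = 0;
    if (waitpid(getpid(), NULL, 0) == -1)
        return errno;
    return 0;
}
```

Here `reap_exit_code(7)` should hand back 7, while `wait_for_non_child()` should fail with `ECHILD`, which is the same error a director would get if it tried to wait for a worker directly.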

    Some kernels support a system call that marks your process as a sub-reaper (on Linux 3.4 and later this is done with prctl()), which is explained in this post. Alternatively, and this is the simpler option in your case, you can write the logic in the intermediate parents so that each one waits for its own children.

    The following call marks the calling process as the sub-reaper for all of its descendants:

    prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0);
    

    You need to issue this system call before you start creating the children and grandchildren. After that, if an intermediate parent exits without collecting its children's exit status with wait(), those children are reassigned to the sub-reaper process, which can then reap them with its own wait() calls.
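A rough, Linux-only sketch of this behavior is shown below. The function name `reap_orphaned_grandchildren` and the 100 ms delay are assumptions made purely for the demonstration: the intermediate parent deliberately exits without calling wait(), and the sub-reaper then collects both the intermediate parent and the orphaned grandchildren.

```c
#include <assert.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Become a sub-reaper, create one intermediate parent with n children,
 * and count every process this one ends up reaping.
 * Returns that count, or -1 if the setup failed. */
int reap_orphaned_grandchildren(int n) {
    assert(n > 0);
    /* Must happen before any children or grandchildren are created. */
    if (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) == -1)
        return -1;

    pid_t mid = fork();
    if (mid < 0)
        return -1;
    if (mid == 0) {
        /* Intermediate parent: create n grandchildren ... */
        for (int j = 0; j < n; j++) {
            if (fork() == 0) {
                /* Grandchild: outlive the intermediate parent (arbitrary delay). */
                struct timespec delay = {0, 100000000L};
                nanosleep(&delay, NULL);
                _exit(0);
            }
        }
        _exit(0);   /* ... and exit WITHOUT wait(): they become orphans */
    }

    /* Sub-reaper: wait() now returns the intermediate parent and also the
     * grandchildren that the kernel reassigned to this process. */
    int reaped = 0;
    while (wait(NULL) > 0)
        reaped++;
    return reaped;
}
```

If every fork() succeeds, `reap_orphaned_grandchildren(2)` should return 3: one intermediate parent plus two grandchildren. Without the prctl() call the grandchildren would instead be handed to PID 1, and the loop would only see the intermediate parent.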

    Testing zombies and reaping can be challenging because so much of the handling is automatic, so I prepared a small Docker container that helps with testing reaping; it can be found on GitHub.