Tags: c, multithreading, operating-system, fork

How does the fork() function behave in if statements?


How does the fork() call work here? I tried to understand it using print statements, but that only confused me more. I know that !fork() is true in the child process (where fork() == 0) and that if (fork()) is true in the parent process, but I don't see how we end up with this output. An explanation would clear up my doubts about fork() calls in if statements.

#include <stdio.h>
#include <unistd.h>

static int x = 0;

int main(int argc, char *argv[])
{
    pid_t p = getpid();
    x++;
    fork();

    if (!fork())
    {

        if (fork())
        {
            x++;

        }
        x++;
    }

    printf("p%d: x = %d\n", getpid() - p, x);
    sleep(60);
    return 0;
}

Output:

p0: x = 1
p1: x = 1
p2: x = 3
p4: x = 2
p3: x = 3
p5: x = 2

Solution

  • Don't think about the fork() calls in the if; think about what fork() actually does.

    fork() is a function that creates a new process by duplicating the calling process (see man 2 fork). That means that when you call fork(), and assuming the call didn't fail, you have two processes: the original process and a new one that is a copy of it. What does "copy" mean? It means that the memory contents and the point where the process resumes execution are the same (in reality a copy involves more than that, but I'm trying to keep it simple).

    Now that there are 2 processes, both resume working at the same point. But they differ at one crucial point: the return value of the fork() call:

    • the parent (aka the original) gets the pid of the forked process
    • the child (aka the copy) gets 0

    That return value is how you distinguish the parent from the child. So when you work with fork(), you should use this pattern:

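    // needs <stdio.h>, <stdlib.h>, <sys/wait.h> and <unistd.h>
    pid_t pid = fork();

    if (pid < 0)
    {
        // ERROR: no process was created, so stop here
        // instead of falling through to the parent branch
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (pid == 0)
    {
        // CHILD PROCESS
    } else {
        // PARENT PROCESS

        // at some point, the parent needs to wait for the child
        waitpid(pid, NULL, 0);  // see man 2 waitpid for more info on that
    }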
    

    So, let's take a look at your code:

    • At the first fork() the value of x is 1, and both processes keep running with x = 1. Let's call the parent process A and the child process B.
    • At the second fork(), A and B each spawn a new process; let's forget about those for a second. In the case of A, the second fork() returns the pid of the new process (C, a child of A), and because of the ! the condition evaluates to 0. So the body of the if statement is not executed for A, which jumps to the printf line and prints 1.
    • The same applies to B, as it is a copy of A.
    • Now let's consider the processes created by the second fork().
    • We know that A spawned a second child, C, and we know what happened to A afterwards. In C, however, the fork() call returns 0, so ! turns it into 1 and the body of the if statement is executed.
    • Now C calls fork() again and spawns D. In C this fork() returns the pid of D, so the inner if is true: C executes x++ and after that x++ again. When it finally prints the value of x, you get 3. C is a child of A, not a grandchild; if A reaches the second fork() before B does and the kernel hands out consecutive pids, C is the third process created, so getpid() - p is 2.
    • etc.

    I'm not going to keep unraveling the fork() calls, but now that you know how fork() behaves, take a pencil and a piece of paper and continue doing it. You'll see that the printed values match.

    edit:

    I suggest that you read the man page for fork(). Open a terminal and type

    man 2 fork

    It explains in greater detail how fork() works and what a copy means.

    You should also read the man page for waitpid. You need to call it unless you want to end up with zombie processes.