Search code examples
pythonoperating-systemforksystem-callscycle

Understanding forking within a for loop


I'm learning multiprocessing in Python and I'm having some trouble visualizing os.fork() in my mind. I get the general cases that are thrown around the website, but I'm really having a hard time with this one.

Code is as follows:

import os

for i in range(2):
    pid1 = os.fork()
    pid2 = os.fork()
    if (pid1>0):
            print("A")
    if (pid2==0):
            print("B")

What I know is that 10 As and 10 Bs are printed. I've messed around with print() to find out what the process tree would look like, and I found that it matches my original idea, something like this:

This diagram represents behaviour of the code, where "type" just means if they came from either a "pid1" or a "pid2" line. Green circles represent the first iteration of the cycle, and blue cirles represent the second. Using this diagram I can account for the 10 Bs printed, given that there are 10 processes of type 2 and each one of those prints B. Where I'm having trouble is in understanding which ones print A. Thanks in advance.


Solution

  • This is a tricky problem. The easiest way to think about this is iteratively. The first iteration, how many processes do we have at the beginning? One. How many at the end? Four. Hopefully that's self-explanatory based on how fork works. Let's calculate how many "A"s should be printed when we start with one process in a single iteration.

    Now, I like to think of pid1 and pid2 as "genes", which are sometimes carried over into the child process. For instance, in the first fork, the parent will hold pid1 > 0, the child will hold pid1 == 0. But the second fork, the next generation of children will "inherit" the same values of pid1 as their parents because we don't modify pid1 in the second fork, just pid2.

    So, if we just look at pid1, how many of our 4 processes at the end of iteration one will have pid1 > 0? Only two: the original parent of the first fork and the second child of that parent from the second fork who inherited its pid1 value. Therefore this iteration, only two "A"s will be printed.

    Now, the second iteration, we start off with 4 processes, but the key is each one of those 4 processes will go through the same iteration described above. We already know how many "A"s will be printed per starting process. In a single iteration, one starting process ended up printing two "A"s, the second iteration will result in 8 "A"s printed if each of the 4 starting processes go through one iteration.

    So in total, how many "A"s printed? Two in the first iteration, and eight in the second iteration. That means you should see 10 "A"s printed out for this program.