
How to know if a process is a parent or a child


How does one identify if a process is a child/grandchild of another process using its pid?


Solution

  • Process IDs: Child and parent processes

    Every running process has a process ID (PID), a non-negative integer that is unique among the processes running at any given moment. PIDs are reused over time, however.

    When a process terminates, its ID becomes available for reuse. Some systems delay reuse so that a newly created process is not mistaken for the old one.

    Certain IDs are "reserved" in the sense that system processes use them, such as the scheduler process. Another example is the init process, which always occupies PID 1. Depending on the system, the ID might be actively reserved.

    Running the commands

    > ps -eaf | head -n 5
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 11:49 ?        00:00:02 /sbin/init splash
    root         2     0  0 11:49 ?        00:00:00 [kthreadd]
    root         3     2  0 11:49 ?        00:00:00 [ksoftirqd/0]
    root         5     2  0 11:49 ?        00:00:00 [kworker/0:0H]
    

    and

    > pidof init
    1
    

    will allow you to verify this independently. 1

    In C we can use the following functions to get the process ID of the calling process and the process ID of its parent:

    #include <unistd.h>
    
    pid_t getpid(void);
    pid_t getppid(void);
    

    A process can create other processes. The created processes are called "child processes" and we refer to the process that created them as the "parent process".

    Creating a new process using fork()

    To create a child process we use the system call fork():

    #include <unistd.h>
    
    pid_t fork(void);
    

    The function is called once, by the parent process, but it returns twice. The return value in the child process is 0, and the return value in the parent process is the process ID of the new child. 2 If fork() fails, it returns -1 in the parent and no child is created.

    A process can have multiple child processes, but there is no system call for a process to get the process IDs of all of its children. The parent therefore has to record the return value of each fork() call and use these identifiers to manage its children.

    A process can only have a single parent process, which is always obtainable by calling getppid.

    The child is a copy of the parent: it gets a copy of the parent's data space, heap, and stack. The two processes do not share these portions of memory! 3

    We will compile and execute the following code snippet to see how this works:

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>
    
    int main(void) {
        int var = 42; // This variable is created on the stack
        pid_t pid;
    
        // One process calls fork(); two processes return from it
        //                 v~~~~~~~~~~|
        if ((pid = fork()) < 0) {
            perror("Fork failed");
            return 1;
        } else if (pid == 0) { // <- Only the child takes this branch
            // The child increments its own copy of var
            var++; 
    
            // pid_t is cast to long so the format specifier is portable
            printf("This is the child process:\n"
                   "\t my pid=%ld\n"
                   "\t parent pid=%ld\n"
                   "\t var=%d\n", (long)getpid(), (long)getppid(), var);
    
        } else { // <- Only the parent takes this branch
            printf("This is the parent process:\n"
                   "\t my pid=%ld\n"
                   "\t child pid=%ld\n"
                   "\t var=%d\n", (long)getpid(), (long)pid, var);
    
        }
    
        return 0;
    }
    

    When we execute the program, we will see that there are no guarantees as to which process gets to execute first. They may even run simultaneously, effectively interleaving their output. 4

    $ # Standard compilation
    $ gcc -std=c99 -Wall fork_example1.c -o fork_example1
    $ # Sometimes the child executes in its entirety first
    $ ./fork_example1
    This is the child process:
         my pid=26485
         parent pid=26484
         var=43
    This is the parent process:
         my pid=26484
         child pid=26485
         var=42
    $ # and sometimes the parent executes in its entirety first
    $ ./fork_example1
    This is the parent process:
         my pid=26461
         child pid=26462
         var=42
    This is the child process:
         my pid=26462
         parent pid=26461
         var=43
    $ # At times the two might interleave
    $ ./fork_example1
    This is the parent process:
         my pid=26455
    This is the child process:
         my pid=26456
         parent pid=26455
         var=43
         child pid=26456
         var=42
    

    1 PID stands for Process ID and PPID stands for Parent Process ID.

    2 The process ID 0 is reserved for use by the kernel, so it is not possible for 0 to be the process ID of a child.

    3 Many systems do not perform a complete copy of these memory segments and instead create a copy only when either process performs a write. Initially, the shared regions are marked by the kernel as "read-only", and whenever a process tries to modify one of these regions the kernel gives that process its own copy of the affected memory.

    4 Standard output is buffered, so this is not a perfect example.