perlfork

Kill children when the parent is killed


Is there an easy way for Perl to kill child processes when the parent is killed? When I run kill on a parent PID the children stay alive.

Test script:

#!/usr/bin/perl
use strict;
use warnings;

my @sleeps = qw( 60 70 80 );
my @pids   = ();

foreach my $sleeptime (@sleeps) {

    my $pid = fork();

    if ( not defined $pid ) {
        die "fork failed: $!";
    }
    elsif ( $pid == 0 ) {

        #child
        system("sleep $sleeptime");
        exit;
    }
    else {

        #parent
        push @pids, $pid;
    }
}

foreach my $pid (@pids) {
    waitpid( $pid, 0 );
}

Solution

  • Note   The second example, which uses an END block, is more complete.
    Note   How to use the process group for this is discussed at the end.


    Most of the time you are probably dealing with the SIGTERM signal, which is what kill sends by default. For this you can arrange to clean up child processes via a handler. Some signals cannot be trapped, notably SIGKILL and SIGSTOP; for those you'd have to go to the OS, per the answer by Kaz. Here is a sketch for the ones that can be handled. Also see the other code below it, the comments below that, and the process group use at the end.

    use warnings;
    use strict;
    use feature qw(say);
    say "Parent pid: $$";
    
    $SIG{TERM} = \&handle_signal;  # Same for others that can (and should) be handled
    
    END { say "In END block ..." }
    
    my @kids;
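    for (1..4) {
        my $pid = fork;
        die "Can't fork: $!" unless defined $pid;  # undef == 0 would be true
        if ($pid == 0) { exec 'sleep 20' or die "Can't exec: $!" }
        push @kids, $pid;
    }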
    say "Started processes: @kids";
    sleep 30;   
    
    sub handle_signal { 
        my $signame = shift;
        say "Got $signame. Clean up child processes.";
        clean_up_kids(@kids);
        die "Die-ing for $signame signal";
    };
    
    sub clean_up_kids { 
        say "\tSending TERM to processes @_";
        my $cnt = kill 'TERM', @_;  
        say "\tNumber of processes signaled: $cnt";
        waitpid $_, 0 for @_;  # blocking
    }
    

    When I run this as signals.pl & and then kill it, it prints

    [13] 4974
    Parent pid: 4974
    Started processes: 4978 4979 4980 4982
    prompt> kill 4974
    Got TERM. Clean up child processes.
            Sending TERM to processes 4978 4979 4980 4982
            Number of processes signaled: 4
    Die-ing for TERM signal at signals.pl line 25.
    In END block ...
    
    [13]   Exit 4                        signals.pl
    

    The processes do get killed, checked by ps aux | egrep '[s]leep' before and after kill.
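
    The comment "Same for others" in the code above can be sketched as follows. This is a minimal illustration, not a recommendation of which signals to trap: the list in @signals is an assumption, and the script signals itself inside eval only so that the effect of the handler can be inspected without killing the demonstration.

```perl
use strict;
use warnings;
use feature qw(say);

# One handler for every signal we want to turn into an orderly die.
# This list is an assumption; adjust it to what your program should handle.
my @signals = qw(TERM INT HUP QUIT);
$SIG{$_} = \&handle_signal for @signals;

sub handle_signal {
    my $signame = shift;    # Perl passes the signal name, e.g. "INT"
    die "Die-ing for $signame signal\n";
}

# Demonstration only: signal ourselves inside eval so the die is caught
# and the message can be examined.
my %caught;
for my $sig (@signals) {
    eval { kill $sig, $$; sleep 1; 1 };
    ($caught{$sig}) = $@ =~ /for (\w+) signal/;
}
say "Handled: @caught{@signals}";
```

    In a real program the die would not be caught, so it would end the program and trigger the END block.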

    Because the handler calls die, the END block runs in an orderly way, so you can clean up child processes there. That way you are also protected against an uncaught die elsewhere. The handler is then needed merely to ensure that the END block cleanup happens.

    use POSIX ":sys_wait_h";   # import tag (note the colon) provides WNOHANG
    $SIG{CHLD} = sub { while (waitpid(-1, WNOHANG) > 0) { } };  # non-blocking
    $SIG{TERM} = \&handle_signal;
    
    END { 
        clean_up_kids(@kids);
        my @live = grep { kill 0, $_ } @kids;
        warn "Processes @live still running" if @live;
    }
    
    sub clean_up_kids { 
        my $cnt = kill 'TERM', @_;
        say "Signaled $cnt processes.";
    }
    sub handle_signal { die "Die-ing for " . shift }
    

    Here we reap (all) terminated child processes in a SIGCHLD handler, see Signals in perlipc and waitpid. We also check in the end whether they are all gone (and reaped).

    The kill 0, $pid returns true even if the child is a zombie (exited but not yet reaped). This can happen in tests because the parent checks right after signaling. Add sleep 1 after clean_up_kids() if needed.
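
    The zombie behavior can be seen in a small self-contained sketch. The 0.2 second pause is an arbitrary assumption to give the child time to exit; the variable names are for illustration only.

```perl
use strict;
use warnings;
use feature qw(say);
use POSIX ();

my $pid = fork();
die "Can't fork: $!" unless defined $pid;
if ($pid == 0) { POSIX::_exit(0) }    # child exits at once, skipping END blocks

select undef, undef, undef, 0.2;      # pause so the child has time to exit

my $before = kill 0, $pid;            # exited but unreaped: still counts as existing
my $reaped = waitpid $pid, 0;         # reap it (blocks only if it has not exited yet)
my $after  = kill 0, $pid;            # after reaping, the PID is gone

say "kill 0 before reaping: $before";
say "waitpid returned the child's pid: ", ($reaped == $pid ? "yes" : "no");
say "kill 0 after reaping: $after";
```

    So a true result from kill 0 means only that the PID still exists, not that the process is still running.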

    Some notes. This is nowhere near a full list of things to consider. Along with the mentioned (and other) Perl docs, also see UNIX and C documentation, as Perl's IPC is built directly on UNIX system tools.

    • Practically all error checking is omitted here. Please add it.

    • Waiting for particular processes is blocking, so if some weren't terminated the program will hang. The non-blocking waitpid has another caveat, see the linked perlipc docs.

    • Child processes may have exited before the parent was killed. The kill 'TERM' sends SIGTERM but this doesn't ensure that the child terminates. Processes may or may not be there.

    • Signal handlers may get invalidated in the END phase, see this post. In my tests the CHLD is still handled here, but if this is a problem re-install the handler, as in the linked answer.

    • There are modules for various aspects of this. See the sigtrap pragma for example.

    • One is well advised not to do much in signal handlers.

    • There is a lot going on, and errors can have unexpected and far-ranging consequences.
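
    Two of the notes above (blocking waits can hang, and SIGTERM does not guarantee termination) suggest polling with a non-blocking waitpid and escalating to SIGKILL after a grace period. The following is a sketch under those assumptions, not part of the original code: terminate_kids, the grace period, and the polling interval are hypothetical names and values chosen for illustration.

```perl
use strict;
use warnings;
use feature qw(say);
use POSIX ":sys_wait_h";
use Time::HiRes ();

# Hypothetical helper: send TERM to the given PIDs, poll for up to $grace
# seconds, then send KILL to whatever is left. Returns the PIDs that needed KILL.
sub terminate_kids {
    my ($grace, @pids) = @_;
    my %left = map { $_ => 1 } @pids;
    kill 'TERM', keys %left;
    my $deadline = Time::HiRes::time() + $grace;
    while (%left and Time::HiRes::time() < $deadline) {
        for my $pid (keys %left) {
            my $res = waitpid $pid, WNOHANG;    # non-blocking
            # $pid: reaped now; -1: no such child (e.g. reaped elsewhere)
            delete $left{$pid} if $res == $pid or $res == -1;
        }
        Time::HiRes::sleep(0.05) if %left;
    }
    my @stubborn = sort { $a <=> $b } keys %left;
    if (@stubborn) {
        kill 'KILL', @stubborn;
        waitpid $_, 0 for @stubborn;    # KILL cannot be caught or ignored
    }
    return @stubborn;
}

# Demonstration: the first child ignores TERM, the second does not.
my @kids;
for my $ignore (1, 0) {
    my $pid = fork();
    die "Can't fork: $!" unless defined $pid;
    if ($pid == 0) {
        $SIG{TERM} = 'IGNORE' if $ignore;
        sleep 20;
        POSIX::_exit(0);
    }
    push @kids, $pid;
}
Time::HiRes::sleep(0.2);    # let the first child install its IGNORE
my @killed = terminate_kids(1, @kids);
say "Needed KILL: @killed";
```

    If a SIGCHLD handler is also reaping children, waitpid here may return -1 for a child that is already gone, which is why that value is treated as done.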


    If you kill the process group you won't have any of these issues, since all children are then terminated as well. On my system this can be done at the terminal by

    prompt> kill -s TERM -pid
    

    You may have to use a numeric signal, generally 15 for TERM; see man kill on your system. The minus sign in -pid signifies a process group rather than a single process. The group ID is normally the same as the parent's process ID, but add say getpgrp; to the code to check. If the process was not simply launched by the shell but, say, from another script, it belongs to its parent's process group, and so do its children. In that case first put it in its own process group, which its children will inherit, and then you can kill that group. See setpgrp and getpgrp.
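
    The same can be done from within Perl, where a negative process ID passed to kill addresses a process group. Below is a minimal sketch, assuming a POSIX system. The parent ignores TERM during the demonstration only because it is itself a member of the group, and the leader check is an added safety measure so that the script never signals a group it does not own.

```perl
use strict;
use warnings;
use feature qw(say);
use POSIX qw(SIGTERM);

# Put this process in its own group so that signaling the group cannot
# reach unrelated processes, such as a calling script.
setpgrp(0, 0);    # can fail if we already lead a session; verified below
my $pgid = getpgrp();
die "Not a process group leader, refusing to signal group $pgid\n"
    unless $pgid == $$;

my @kids;
for (1..3) {
    my $pid = fork();
    die "Can't fork: $!" unless defined $pid;
    if ($pid == 0) { exec 'sleep', '20' or die "Can't exec: $!" }
    push @kids, $pid;    # children inherit the parent's process group
}

my %status;
{
    local $SIG{TERM} = 'IGNORE';    # the parent is in the group too
    kill 'TERM', -$pgid;            # negative ID means "process group"
    for my $pid (@kids) {
        waitpid $pid, 0;
        $status{$pid} = $? & 127;   # low 7 bits: terminating signal number
    }
}
say "Child $_ ended by signal $status{$_}" for @kids;
```

    To take the parent down along with its children, as in the terminal example, leave out the local $SIG{TERM} = 'IGNORE' line or send the signal from outside.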