Tags: c, architecture, signals, multiprocess, systems-programming

Exception handling through signal communication in multiprocess programs in C


We're currently grappling with an issue in a multiprocess program that involves the use of signals to safely terminate its child processes.

In our setup, a parent process sends a signal to its child processes to terminate and free all resources when it encounters a critical error (e.g., failure in malloc(), pipe(), or fork()). This strategy is designed to prevent unpredictable behavior due to the interdependencies among the child processes.

The challenge arises when we experience a sporadic double free within our main struct, which is accessible across most of our functions. Here's the situation:

Suppose a child process is in the midst of freeing a variable from the main struct (let's say an array). The process of iterating through the array and freeing each element is of course not instantaneous. If the function that frees the array is midway through this operation and the process receives a signal (which triggers the signal handler to free the main struct and terminate the process), we end up with a double free for the first half of the array that has already been freed.

We already found out that free() is not an async-signal-safe function and should not be called in signal handler functions.

This brings us to our central query:

What is a good heuristic for deciding when to check the flags set by signal handlers?

What are safe and elegant strategies to tackle this issue?

Thanks in advance for your input. Every response is valuable!

We considered two possible solutions:

  1. Limiting signal handler functions to merely setting a global flag, which then needs to be checked frequently during the regular program flow to determine if an abort is necessary. However, this does not seem very elegant.

    • What would be a good timing to check the state of this flag?
      • Before every request for more resources from the OS (malloc(), pipe(), fork())?
      • Before returning from any function?
  2. Setting each pointer to NULL immediately after it has been freed.

    • Is there a risk that a signal might arrive right after a pointer has been freed but before it's been set to NULL?
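To make option 1 concrete, here is a minimal sketch, assuming SIGTERM is the termination signal and the main struct holds an array of heap-allocated strings. All names (abort_requested, on_terminate, install_handler, should_abort, free_items) are hypothetical and not taken from our code base. The handler only sets a volatile sig_atomic_t flag, so it can never interrupt a half-finished cleanup with a second cleanup.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag: the only state the handler is allowed to touch. */
static volatile sig_atomic_t abort_requested = 0;

/* Handler: record the request and return. It does not free anything. */
static void on_terminate(int signo)
{
    (void)signo;
    abort_requested = 1;
}

/* Install the handler for SIGTERM (assumed to be the termination signal).
   Returns 0 on success, -1 on failure. */
static int install_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_terminate;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Checked from normal program flow, at points where no cleanup is half done. */
static int should_abort(void)
{
    return abort_requested != 0;
}

/* Free every element. A signal arriving mid-loop only sets the flag,
   so the loop always runs to completion and nothing is freed twice. */
static void free_items(char **items, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(items[i]);
        items[i] = NULL;
    }
}
```

One possible heuristic for the timing: call should_abort() between units of work (for example once per iteration of the main loop) and after a blocking call fails with EINTR, rather than before every single allocation.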

Solution

  • A strategy could be to set the members that are freed to NULL just before calling free:

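    for (...) {
        void *mem = arr[i];   /* keep a private copy of the pointer */
        arr[i] = NULL;        /* clear the slot before freeing it */
        free(mem);            /* free(NULL) is a no-op on a later pass */
    }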
    

    Calling free on a NULL pointer is a no-op, so a second pass over an array that was already (partly) freed does nothing for the slots that were cleared.

    The remaining window is between setting the member to NULL and calling free: if the process is interrupted there, that one member won't be freed.

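    However, as the comments noted, if the process is terminating anyway, there is no need to call free at all: the OS reclaims all memory used by the process on exit. Calling free from the handler may even be harmful, because free() is not async-signal-safe and can corrupt the heap or deadlock if the signal interrupted another allocator call.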
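The "don't free, just exit" idea could look roughly like the sketch below. It assumes SIGTERM is the signal the parent sends and uses hypothetical names (CHILD_ABORT_STATUS, terminate_now, install_exit_handler, demo_terminate_child). The handler calls only _exit(), which POSIX lists as async-signal-safe, unlike exit() and free(). The handler is installed before fork() so the child inherits it and there is no window in which the signal would hit the default action.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit status a child reports when told to abort. */
#define CHILD_ABORT_STATUS 1

/* Handler: leave immediately and let the OS reclaim memory.
   No free(), no exit(), no stdio: only async-signal-safe _exit(). */
static void terminate_now(int signo)
{
    (void)signo;
    _exit(CHILD_ABORT_STATUS);
}

/* Install the handler for SIGTERM. Returns 0 on success, -1 on failure. */
static int install_exit_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = terminate_now;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Fork an idle child, ask it to terminate, and return its exit status
   (or -1 on any error). Note that the parent keeps the handler too. */
static int demo_terminate_child(void)
{
    if (install_exit_handler() != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();            /* child: wait until the signal arrives */
    }

    if (kill(pid, SIGTERM) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If a child owns resources the OS does not clean up on exit (temporary files, for instance), this shortcut is not enough on its own, and a flag checked from normal program flow is probably the safer route for those.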