Logging Signals in Python

I have a single-threaded Python application. The standard way in which this application gets shut down is by sending it SIGINT and letting the various with: and try: finally: blocks handle a safe and graceful shutdown. However, this results in a not-particularly-readable logfile, since you simply start seeing the messages from the handlers with no clear indication of what's going on or why its shutting down.

I tried to solve this by adding a simple signal handler that would log the received signal before raising KeyboardInterrupt, like so:

def log_and_exit_handler(signum, stack):
    logger.info(f"Terminating due to signal {_signal.Signals(signum)}", stack_info=True)
    raise KeyboardInterrupt()
_signal.signal(_signal.SIGINT, log_and_exit_handler)

However, while testing it, I got a logging error:

--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1029, in emit
    self.flush()
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1009, in flush
    self.stream.flush()
RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>
Call stack:
  [REDACTED]
  File "[REDACTED]", line 485, in _save_checkpoint
    _logger.info(f"Checkpoint saved")
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1378, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1514, in _log
    self.handle(record)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1524, in handle
    self.callHandlers(record)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1586, in callHandlers
    hdlr.handle(record)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 894, in handle
    self.emit(record)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1029, in emit
    self.flush()
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1009, in flush
    self.stream.flush()
  File "[REDACTED]", line 203, in log_and_exit_handler
    logger.info("Terminating due to signal {signal_}".format(signal_=signal_), stack_info=True)
Message: 'Terminating due to signal 2'
Arguments: ()

Apparently, the app was already in the middle of outputting a log message when the signal was received, and the logging module is not re-entrant.

Are there any workarounds or alternate approaches I can use to safely accomplish my goal of logging a signal when it is received?

Solution

So, I finally found a solution: block the offending signals whenever we're writing a log message to STDERR using signal.pthread_sigmask(), then unblock them after. Signals are blocked are not lost, and will be processed as soon as the block is removed.

class SignalSafeStreamHandler(logging.StreamHandler):
    """Stream handler that blocks signals while emitting a record.

    The locks for writing to streams like STDERR are not reentrant,
    resulting in a potential logging failure if, while in the process of writing a log message,
    we process a signal and the signal handler tries to log something itself.

    This class keeps a list of signals which have handlers that might try to log anything,
    and blocks those signals for the duration of the write operation.
    After the write op is done, it unblocks the signals, allowing any that arrived during the duration to be processed.
    """

    def __init__(self, stream=None):
        super().__init__(stream=stream)

        self._signals_to_block: _tp.Set[_signal.Signals] = set()

    @property
    def signals_to_block(self) -> _tp.Set[_signal.Signals]:
        """The set of signals to block while writing to the stream"""
        return set(self._signals_to_block)

    def add_blocked_signal(self, signal: _signal.Signals) -> None:
        """Add the given signal to the list of signals to block"""
        self._signals_to_block.add(signal)

    def emit(self, record: logging.LogRecord) -> None:
        """Emit the given record; if this is the main thread, block the set signals while doing so"""

        # Only the main thread is subject to signal handling; other threads can operate normally
        if _threading.current_thread() is not _threading.main_thread():
            super().emit(record)
            return

        # We block any signals which have handlers that want to do logging while we write.
        old_blocked_signals: _tp.Set[_signal.Signals] = _signal.pthread_sigmask(
            _signal.SIG_BLOCK,
            self._signals_to_block,
        )
        try:
            # Delegate to the parent method to do the actual logging
            super().emit(record)
        finally:
            # Now we unblock the signals
            # (or, more precisely, restore the old set of blocked signals;
            # if one of those signals was already blocked when we started, it will stay that way).
            #
            # Any signals of those types which were received in the duration
            # will immediately be processed and their handlers run.
            _signal.pthread_sigmask(_signal.SIG_SETMASK, old_blocked_signals)

_root_handler = SignalSafeStreamHandler()
logging.root.addHandler(_root_handler)

def log_and_exit_handler(signum, stack):
    logger.info(f"Terminating due to signal {_signal.Signals(signum)}", stack_info=True)
    raise KeyboardInterrupt()

_signal.signal(_signal.SIGINT, log_and_exit_handler)
_root_handler.add_blocked_signal(signal)

This solution could be extended to other logging handlers besides StreamHandler, e.g. if you're logging to JournalD or to a DB. However, be cautious using it for any handler where the write operation could potentially take a non-trivial amount of time to complete. For example, if you're logging to a DB and the DB goes down, the write operation would freeze until the DB request times out, and you'd be unable to interrupt it earlier because SIGINT is masked for the duration. On possible solution would be to use a seperate thread to do the logging; only the main thread is subject to interruption by a signal handler.

Limitation: Signal masking is only available on Linux, not Windows.