What happens if a signal handler is invoked while at a cancellation point?


Suppose a thread is blocked at a cancellation point, for example read, when a signal arrives and a signal handler is invoked. glibc/NPTL implements cancellation points by enabling asynchronous cancellation for the duration of the syscall, so as far as I can tell, asynchronous cancellation remains in effect for the entire duration of the signal handler. This would of course be horribly wrong, since plenty of functions are not async-cancel-safe yet are required to be safe to call from signal handlers.

This leaves me with two questions:

  • Am I wrong or is the glibc/NPTL behavior really this dangerously broken? If so, is such dangerous behavior conformant?
  • What, according to POSIX, is supposed to happen if a signal handler is invoked while the thread is executing a function that is a cancellation point?

Edit: I've almost convinced myself that any thread which is a potential target of pthread_cancel must ensure that functions which are cancellation points can never be called from a signal handler in that thread's context:

On the one hand, if a thread that might be cancelled uses any async-cancel-unsafe functions, then any signal handler that can be invoked in that thread must disable cancellation before calling any function that is a cancellation point. This is because, from the perspective of the code interrupted by the signal, any such cancellation would be equivalent to asynchronous cancellation. On the other hand, a signal handler cannot disable cancellation unless the code that will be running when the handler is invoked uses only async-signal-safe functions, because pthread_setcancelstate is not async-signal-safe.


Solution

  • To answer the first half of my own question: glibc does exhibit the behavior I predicted. A signal handler that runs while its thread is blocked at a cancellation point runs under asynchronous cancellation. To see this effect, create a thread that invokes a cancellation point that will block forever (or for a long time), wait a moment, send it a signal, wait a moment again, then cancel and join it. The signal handler should fiddle with some volatile variables in a way that makes it clear that it ran for an unpredictable amount of time before being terminated asynchronously.

    As for whether POSIX allows this behavior, I'm still not 100% certain. POSIX states:

    Whenever a thread has cancelability enabled and a cancellation request has been made with that thread as the target, and the thread then calls any function that is a cancellation point (such as pthread_testcancel() or read()), the cancellation request shall be acted upon before the function returns. If a thread has cancelability enabled and a cancellation request is made with the thread as a target while the thread is suspended at a cancellation point, the thread shall be awakened and the cancellation request shall be acted upon. It is unspecified whether the cancellation request is acted upon or whether the cancellation request remains pending and the thread resumes normal execution if:

    • The thread is suspended at a cancellation point and the event for which it is waiting occurs

    • A specified timeout expired

    before the cancellation request is acted upon.

    Presumably executing a signal handler is not a case of being "suspended", so I'm leaning towards interpreting glibc's behavior here as non-conformant.