
POSIX signal being blocked in signal handler despite not being in sa_mask


I posted a similar question yesterday but I did a poor job of outlining my problem and since then I think I have made progress.

My minimal working example is still quite long so I will post relevant snippets but the full example can be found here.

My setup is quite simple: I have two POSIX message queues, both configured for asynchronous notification and both handled by the same handler on the same thread. The problem is at a more fundamental level: if a separate thread sends to both queues back to back, the signal handler runs only once, for the first queue. That would make sense if the signal were blocked, since according to the GNU C Library manual a signal is automatically blocked while its handler runs.

As such, when configuring my struct sigaction I made sure to remove the target signal (SIGIO) from the sigset_t that I set as sa_mask. My assumption was that, combined with SA_NODEFER as explained in sigaction(2), this would allow the signal handler to be invoked again while it is already running (I am not sure whether "recursively" is the right word here).

sa_mask specifies a mask of signals which should be blocked (i.e., added to the signal mask of the thread in which the signal handler is invoked) during execution of the signal handler. In addition, the signal which triggered the handler will be blocked, unless the SA_NODEFER flag is used.
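To make the quoted behaviour concrete, here is a minimal sketch that is independent of the message queues. The helper `max_nesting_depth` and the handler `nested_handler` are hypothetical names introduced only for this illustration. The handler re-raises its own signal once and records how deeply invocations nest, so the effect of SA_NODEFER can be observed directly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* State shared with the handler; sig_atomic_t keeps the accesses signal-safe. */
static volatile sig_atomic_t depth, max_depth, reraised;

/* Hypothetical handler: raises the same signal once from inside itself. */
static void nested_handler(int signo)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    if (!reraised) {
        reraised = 1;
        /* raise() is async-signal-safe. The new instance is delivered
         * immediately only if signo is not blocked right now. */
        raise(signo);
    }
    depth--;
}

/*
 * Hypothetical helper: install nested_handler for signo with the given
 * sa_flags, raise signo once, and report the deepest handler nesting seen.
 * Returns -1 if the handler could not be installed.
 */
int max_nesting_depth(int signo, int flags)
{
    struct sigaction sa, old_sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = nested_handler;
    sa.sa_flags = flags;
    sigemptyset(&sa.sa_mask);   /* no extra signals blocked in the handler */
    if (sigaction(signo, &sa, &old_sa) == -1)
        return -1;

    depth = 0;
    max_depth = 0;
    reraised = 0;
    raise(signo);

    sigaction(signo, &old_sa, NULL);   /* restore the previous disposition */
    return (int)max_depth;
}
```

Called as `max_nesting_depth(SIGUSR1, 0)`, the re-raised instance stays pending until the first handler returns, so the depth never exceeds 1. With `max_nesting_depth(SIGUSR1, SA_NODEFER)`, the second invocation interrupts the first and the depth reaches 2.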

The relevant code for attaching the signal handler to the message queue:

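/** mq_open returns (mqd_t)-1 on failure, so test for that value explicitly */
conn->fd = mq_open(conn->name, O_CREAT | O_RDONLY | O_NONBLOCK, 0644, &attr);
assert(conn->fd != (mqd_t)-1);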

/** Setup handler for SIGIO */
/** sigaction(2) specifies that the triggering signal is blocked in the handler  */
/**     unless SA_NODEFER is specified */
sa.sa_flags = SA_SIGINFO | SA_RESTART | SA_NODEFER; 
sa.sa_sigaction = sigHandler;
/** sa_mask specifies signals that will be blocked in the thread the signal  */
/**     handler executes in */
sigfillset(&sa.sa_mask);
sigdelset(&sa.sa_mask, SIGIO);
if (sigaction(SIGIO, &sa, NULL)) {
    printf("Sigaction failed\n");
    goto error;
}

printf("Handler set in PID: %d for TID: %d\n", getpid(), gettid());

/** fcntl(2) - F_SETOWN_EX is used to target SIGIO and SIGURG signals to a  */
/**     particular thread */
struct f_owner_ex cur_tid = { .type = F_OWNER_TID, .pid = gettid() };
assert(-1 != fcntl(conn->fd, F_SETOWN_EX, &cur_tid));

As a sanity check, I inspected the signal mask inside the handler to see whether SIGIO was blocked.

void sigHandler(int signal, siginfo_t *info, void *context)
{
    sigset_t sigs;
    sigemptyset(&sigs);
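    /* With a NULL set, `how` is ignored and only the current mask is returned */
    pthread_sigmask(SIG_BLOCK, NULL, &sigs);
    if (sigismember(&sigs, SIGIO)) {
        /* printf is not async-signal-safe; used here for debugging only */
        printf("SIGIO being blocked in handler\n");
        /* Unblock only SIGIO, not every signal in the current mask */
        sigemptyset(&sigs);
        sigaddset(&sigs, SIGIO);
        pthread_sigmask(SIG_UNBLOCK, &sigs, NULL);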
    }
...
}

But SIGIO does not appear to be blocked. Given two message queues, MQ1 and MQ2, that both notify asynchronously via SIGIO and share the same handler, my reasoning tells me that the sequence below should occur. Because of the timing of the two threads and the latency of the signals, it is hard for me to know for certain, so this is only a somewhat educated guess:

  • mq_send to MQ1 directly followed by mq_send to MQ2 from thread 1
  • MQ1's signal handler should fire given the SIGIO from MQ1 on thread 2
  • MQ2's signal handler would interrupt MQ1's signal handler on thread 2
  • MQ2's signal handler completes on thread 2
  • MQ1's signal handler completes on thread 2

Running the example I linked earlier, the following behaviour is observed:

  • mq_send to MQ1 directly followed by mq_send to MQ2 from thread 1
  • MQ1's signal handler fires and completes

This makes me think that SIGIO is somehow being blocked or ignored during the signal handler. Given what I have read about sa_mask and my sanity check using pthread_sigmask, I am not sure why I am getting the behavior I am seeing. I am hoping I have missed some little nugget of knowledge somewhere in the manpages.


Solution

  • My problem is at a more fundamental level in that if a separate thread sends to both queues sequentially then the sig handler is only run once for the first queue... Which makes me think that somehow SIGIO is being blocked or ignored during the signal handler.

    SIGIO is a standard signal, not a real-time one. From POSIX Signal Concepts:

    During the time between the generation of a signal and its delivery or acceptance, the signal is said to be "pending". Ordinarily, this interval cannot be detected by an application.

    ...

    If a subsequent occurrence of a pending signal is generated, it is implementation-defined as to whether the signal is delivered or accepted more than once in circumstances other than those in which queuing is required. The order in which multiple, simultaneously pending signals outside the range SIGRTMIN to SIGRTMAX are delivered to or accepted by a process is unspecified.

    On Linux, standard signals do not queue; instead, a new instance is dropped when one is already pending. From the Linux signal(7) man page:

    Queueing and delivery semantics for standard signals

    If multiple standard signals are pending for a process, the order in which the signals are delivered is unspecified.

    Standard signals do not queue. If multiple instances of a standard signal are generated while that signal is blocked, then only one instance of the signal is marked as pending (and the signal will be delivered just once when it is unblocked). In the case where a standard signal is already pending, the siginfo_t structure (see sigaction(2)) associated with that signal is not overwritten on arrival of subsequent instances of the same signal. Thus, the process will receive the information associated with the first instance of the signal.
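The coalescing described above can be observed without any message queues. The following is a minimal sketch, assuming a single-threaded Linux process; `deliveries_after_unblock` and `count_delivery` are hypothetical names introduced for this illustration. It blocks a signal, raises it several times, unblocks it, and counts how often the handler ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Number of handler invocations; sig_atomic_t keeps the update signal-safe. */
static volatile sig_atomic_t deliveries;

static void count_delivery(int signo)
{
    (void)signo;
    deliveries++;
}

/*
 * Hypothetical helper: block signo, raise it `times` times while blocked,
 * then unblock it and return how many times the handler actually ran.
 * Returns -1 if the handler or the mask could not be set up.
 */
int deliveries_after_unblock(int signo, int times)
{
    struct sigaction sa, old_sa;
    sigset_t block;

    assert(times >= 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, &old_sa) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, signo);
    /* sigprocmask is used on the assumption of a single-threaded process;
     * a multithreaded program would use pthread_sigmask instead. */
    if (sigprocmask(SIG_BLOCK, &block, NULL) == -1)
        return -1;

    deliveries = 0;
    for (int i = 0; i < times; i++)
        raise(signo);               /* each instance arrives while blocked */

    /* Whatever is pending is delivered as the signal becomes unblocked. */
    sigprocmask(SIG_UNBLOCK, &block, NULL);

    sigaction(signo, &old_sa, NULL);
    return (int)deliveries;
}
```

On Linux I would expect `deliveries_after_unblock(SIGUSR1, 3)` to return 1, because the three instances collapse into a single pending signal. `deliveries_after_unblock(SIGRTMIN, 3)` should return 3, because real-time signals are queued. SIGIO behaves like SIGUSR1 here, which matches the single handler invocation seen in the question.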


    One workaround for your problem would be to use SIGEV_THREAD notifications instead of SIGEV_SIGNAL, so that your callback is called from a separate thread. That also removes the signal-handler restriction of only being able to call async-signal-safe functions.
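A rough sketch of that workaround, using mq_notify(3) with SIGEV_THREAD, could look like the following. The names `on_queue_ready`, `arm_notification`, `open_and_arm` and `messages_seen` are hypothetical, and the queue attributes and the O_RDWR open mode are assumptions made for illustration, not taken from the question's code. Older glibc versions may need `-lrt` at link time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <mqueue.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative counter of messages drained by the callback. */
atomic_int messages_seen;

int arm_notification(mqd_t *mq);

/* Runs on a thread created by the implementation, not in a signal handler,
 * so it is not limited to async-signal-safe functions. */
void on_queue_ready(union sigval sv)
{
    mqd_t *mq = sv.sival_ptr;
    struct mq_attr attr;
    char *buf;

    /* A registration is removed once it fires, so re-arm before draining;
     * otherwise a message arriving after the drain would go unnoticed. */
    arm_notification(mq);

    if (mq_getattr(*mq, &attr) == -1)
        return;
    buf = malloc((size_t)attr.mq_msgsize);
    if (buf == NULL)
        return;
    /* The queue is O_NONBLOCK, so the loop ends with EAGAIN once it is empty. */
    while (mq_receive(*mq, buf, (size_t)attr.mq_msgsize, NULL) >= 0)
        atomic_fetch_add(&messages_seen, 1);
    free(buf);
}

/* Register a one-shot SIGEV_THREAD notification for the queue behind *mq.
 * The pointer must stay valid for as long as notifications can fire. */
int arm_notification(mqd_t *mq)
{
    struct sigevent sev;

    assert(mq != NULL);
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_THREAD;
    sev.sigev_notify_function = on_queue_ready;
    sev.sigev_notify_attributes = NULL;     /* default thread attributes */
    sev.sigev_value.sival_ptr = mq;
    return mq_notify(*mq, &sev);
}

/* Hypothetical setup helper: open a non-blocking queue and arm it.
 * The attribute values below are illustrative assumptions. */
int open_and_arm(const char *name, mqd_t *mq)
{
    struct mq_attr attr;

    memset(&attr, 0, sizeof attr);
    attr.mq_maxmsg = 4;
    attr.mq_msgsize = 64;
    *mq = mq_open(name, O_CREAT | O_RDWR | O_NONBLOCK, 0600, &attr);
    if (*mq == (mqd_t)-1)
        return -1;
    return arm_notification(mq);
}
```

Each queue gets its own registration, so MQ1 and MQ2 no longer share a single pending SIGIO. Because a notification only fires when a message arrives on an empty queue, the callback drains everything available rather than reading a single message.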