c multithreading signals setjmp

Trapping signals in a multithreaded environment


I have a large program with a large number of threads that needs to be as resilient as possible. I need to catch fatal signals such as SIGBUS and SIGSEGV, and then either re-initialize the problem thread or disable it so that the program can continue with reduced functionality.

My first thought is to call setjmp in each thread and install signal handlers that log the problem and then longjmp back to a recovery point in that thread. The difficulty is that the signal handler would need to determine which thread the signal came from in order to use the appropriate jump buffer, since jumping back into the wrong thread would be useless.

Does anyone have any idea how to determine the offending thread in the signal handler?


Solution

  • Using syscall(SYS_gettid) works for me on my Linux box. Compile with: gcc pt.c -lpthread -Wall -Wextra

    //pt.c
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <pthread.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <setjmp.h>
    #include <signal.h>
    #include <string.h>
    #include <ucontext.h>
    #include <stdlib.h>
    
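    // One jump buffer per kernel thread ID. This assumes every TID is below
    // 65536, which is not guaranteed (see /proc/sys/kernel/pid_max).
    static sigjmp_buf jmpbuf[65536];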
    
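    static void handler(int sig, siginfo_t *siginfo, void *context)
    {
        (void)siginfo;  // unused
        (void)context;  // unused; could be cast to ucontext_t * to inspect registers
        // A synchronous SIGSEGV is delivered to the faulting thread, so this is its TID.
        pid_t tid = syscall(SYS_gettid);
    
        // printf is not async-signal-safe; it is used here for demonstration only.
        printf("Thread %d in handler, signal %d\n", tid, sig);
        // Refuse to index past the end of jmpbuf if the TID is out of range.
        if (tid < 0 || (size_t)tid >= sizeof jmpbuf / sizeof jmpbuf[0])
            _exit(1);
        siglongjmp(jmpbuf[tid], 1);
    }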
    
    static void *threadfunc(void *data)
    {
        int index, segvindex = *(int *)data;
        pid_t tid = syscall(SYS_gettid);
    
        for(index = 0; index < 500; index++) {
            if (sigsetjmp(jmpbuf[tid], 1) == 1) {
                printf("Recovery of thread %d\n", tid); 
                continue;
            }
            printf("Thread %d, index %d\n", tid, index);
            if (index % 5 == segvindex) {
                printf("%zu\n", strlen((char *)2)); // SIGSEGV
            }
            sched_yield(); // pthread_yield() is non-standard and deprecated in glibc
        }
        return NULL;
    }
    
    int main(void)
    {
        pthread_t thread1, thread2, thread3;
        int segvindex1 = rand() % 5;
        int segvindex2 = rand() % 5;
        int segvindex3 = rand() % 5;
        struct sigaction sact;
    
        memset(&sact, 0, sizeof sact);
        sact.sa_sigaction = handler;
        sact.sa_flags = SA_SIGINFO;
        if (sigaction(SIGSEGV, &sact, NULL) < 0) {
            perror("sigaction");
            return 1;
        }
        pthread_create(&thread1, NULL, &threadfunc, (void *) &segvindex1);
        pthread_create(&thread2, NULL, &threadfunc, (void *) &segvindex2);
        pthread_create(&thread3, NULL, &threadfunc, (void *) &segvindex3);
        pthread_join(thread1, NULL);
        pthread_join(thread2, NULL);
        pthread_join(thread3, NULL);
        return 0;
    }
    

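    To be more portable, pthread_self() can be used instead of the Linux-specific gettid system call. It is async-signal-safe. Note that pthread_t is an opaque type, so it cannot index an array directly; values have to be compared with pthread_equal().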

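    Strictly speaking, however, the thread that received the SIGSEGV should not call siglongjmp, because execution then continues outside the handler and may invoke non-async-signal-safe functions while the interrupted code has left its state inconsistent. It should instead start a new thread by async-signal-safe means.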
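
The pthread_self suggestion above could be sketched roughly as follows. This is only an illustration, not the answer's tested code: the names MAX_SLOTS, struct slot, worker and run_demo are hypothetical, raise(SIGSEGV) stands in for a real fault so the behaviour is well defined, and pthread_equal is assumed to be usable from a handler even though it is not on the POSIX async-signal-safe list. It still uses siglongjmp, so the caveat about continuing after a fault applies to it as well.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

#define MAX_SLOTS 8              /* hypothetical upper bound on worker threads */

/* One recovery slot per worker thread (illustrative layout). */
struct slot {
    pthread_t  thread;           /* written by the owning worker */
    atomic_int registered;       /* nonzero once 'thread' is valid */
    sigjmp_buf env;              /* recovery point of the owning worker */
    int        recovered;        /* set when the worker came back via siglongjmp */
};

static struct slot slots[MAX_SLOTS];
static int slot_count;

static void handler(int sig)
{
    /* A thread-directed signal runs its handler in the target thread,
       so pthread_self() identifies the thread that faulted. */
    pthread_t self = pthread_self();

    for (int i = 0; i < slot_count; i++) {
        if (atomic_load(&slots[i].registered) &&
            pthread_equal(slots[i].thread, self))
            siglongjmp(slots[i].env, 1);
    }
    /* Unknown thread: fall back to the default action. */
    signal(sig, SIG_DFL);
    raise(sig);
}

static void *worker(void *arg)
{
    struct slot *s = arg;

    s->thread = pthread_self();
    atomic_store(&s->registered, 1);

    if (sigsetjmp(s->env, 1) != 0) {
        /* Reached from the handler; savemask=1 restored the signal mask. */
        s->recovered = 1;
        return NULL;
    }
    raise(SIGSEGV);              /* stand-in for a real fault in this thread */
    return NULL;                 /* not reached while the handler is installed */
}

/* Starts n workers, lets each fault once, and returns how many recovered
   (or -1 on error). */
int run_demo(int n)
{
    pthread_t tids[MAX_SLOTS];
    struct sigaction sa;
    int created = 0, recovered = 0;

    if (n < 0 || n > MAX_SLOTS)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;

    slot_count = n;
    for (int i = 0; i < n; i++) {
        atomic_store(&slots[i].registered, 0);
        slots[i].recovered = 0;
        if (pthread_create(&tids[i], NULL, worker, &slots[i]) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++) {
        pthread_join(tids[i], NULL);
        recovered += slots[i].recovered;
    }
    return created == n ? recovered : -1;
}
```

Calling run_demo(3) from a main function and compiling with something like gcc sketch.c -pthread should report three recovered workers. Each worker registers its own pthread_t rather than relying on the value stored by pthread_create, so the handler never compares against an ID that has not been written yet.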