c++ c segmentation-fault gdb coredump

When using a coredump in gdb how do I know exactly which thread caused SIGSEGV?


My application uses more than 8 threads. When I run info threads in gdb, I see the threads and the last function each one was executing, but it is not obvious to me which thread caused the SIGSEGV. Is it possible to tell? Is it thread 1? How are the threads numbered?


Solution

  • When you load a core dump file into gdb, gdb stops at the function that was executing when the program crashed, and the current thread is the culprit. Take the following program as an example:

    #include <stdio.h>
    #include <pthread.h>
    #include <unistd.h>     /* sleep() */
    void *thread_func(void *p_arg)
    {
            while (1) {
                    printf("%s\n", (char*)p_arg);   /* a NULL p_arg is dereferenced here */
                    sleep(10);
            }
    }
    int main(void)
    {
            pthread_t t1, t2;
    
            pthread_create(&t1, NULL, thread_func, "Thread 1");
            pthread_create(&t2, NULL, thread_func, NULL);   /* NULL argument triggers the crash */
    
            sleep(1000);
            return 0;
    }
    

    The t2 thread crashes the program because it is passed a NULL pointer and dereferences it when printing. After the program crashes, use gdb to analyze the core dump file:

    [root@localhost nan]# gdb -q a core.32794
    Reading symbols from a...done.
    [New LWP 32796]
    [New LWP 32795]
    [New LWP 32794]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".
    Core was generated by `./a'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
    (gdb)
    

    gdb stops in the __strlen_sse2 function, which means this is the function that was executing when the SIGSEGV was raised. Then use the bt command to see which thread called it:

    (gdb) bt
    #0  0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
    #1  0x00000034e4268cdb in puts () from /lib64/libc.so.6
    #2  0x00000000004005cc in thread_func (p_arg=0x0) at a.c:7
    #3  0x00000034e4a079d1 in start_thread () from /lib64/libpthread.so.0
    #4  0x00000034e42e8b6d in clone () from /lib64/libc.so.6
    (gdb) i threads
      Id   Target Id         Frame
      3    Thread 0x7ff6104c1700 (LWP 32794) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
      2    Thread 0x7ff6104bf700 (LWP 32795) 0x00000034e42accdd in nanosleep () from /lib64/libc.so.6
    * 1    Thread 0x7ff60fabe700 (LWP 32796) 0x00000034e4281451 in __strlen_sse2 () from /lib64/libc.so.6
    

    The bt command shows the stack frames of the current thread, which is the one that received the SIGSEGV. The "i threads" command (short for "info threads") shows all the threads; the entry marked with * is the current thread.

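    As for "How are the threads numbered?": the numbers in the Id column are assigned by gdb itself, in the order in which it learns about the threads, so they do not have to match the order in which the program created them. For a core file, that order depends on how the OS wrote the dump. In this example the faulting thread happens to be thread 1, but the * marker is the reliable indicator. You can refer to the gdb manual for more information.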
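
Because the Id column is gdb's own numbering, the LWP number in the Target Id column is the value that can be matched against anything the program itself logged. The following is a minimal sketch, assuming Linux and glibc, of how a thread could record its own LWP. The helper names current_lwp and format_thread_tag are hypothetical and are not part of the example program above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the kernel thread ID of the calling
 * thread, which is the number gdb prints as "LWP nnnnn".
 * Assumes Linux; the raw syscall is used because older glibc
 * versions do not provide a gettid() wrapper. */
pid_t current_lwp(void)
{
        return (pid_t)syscall(SYS_gettid);
}

/* Hypothetical helper: writes "<name> (LWP <n>)" into buf so the
 * caller can log it. Returns the snprintf() result. */
int format_thread_tag(char *buf, size_t len, const char *name)
{
        return snprintf(buf, len, "%s (LWP %ld)",
                        name ? name : "(unnamed)", (long)current_lwp());
}
```

If each thread logged such a tag when it started, the LWP shown on the line marked with * in the "i threads" output could be looked up in the program's log to see which thread that was.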