How to translate kernel's trap divide error rsp:2b6d2ea40450 to a source location?


A customer reported an error in one of our programs caused by a division by zero. The only evidence we have is this log line:

kernel: myprog[16122] trap divide error rip:79dd99 rsp:2b6d2ea40450 error:0 

I do not believe there is a core file for it.

I searched the Internet for a way to identify the line of the program that caused this division by zero, but so far without success.

I understand that 16122 is the pid of the program, so that will not help me.

I suspect that rsp:2b6d2ea40450 has something to do with the address of the line that caused the error (0x2b6d2ea40450) but is that true?

If so, how can I translate it to an approximate location in the source, assuming I can load a debug version of myprog into gdb and ask it to show the context around this address?

Any, any help will be greatly appreciated!


Solution

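  • rip is the instruction pointer; rsp is the stack pointer. So the address you want is the one after rip (79dd99 in your log line), not the one after rsp. The stack pointer is not much use unless you have a core image or a running process.

    You can use either addr2line or the disassemble command in gdb to find the line that caused the error, based on the instruction pointer (printed as rip in your log line and as ip in the example below).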

    $ cat divtest.c
    main()
    {
        int a, b;
    
        a = 1; b = a/0;
    }
    
    $ ./divtest
    Floating point exception (core dumped)
    $ dmesg|tail -1
    [ 6827.463256] traps: divtest[3255] trap divide error ip:400504 sp:7fff54e81330
        error:0 in divtest[400000+1000]
    
    $ addr2line -e divtest 400504
    ./divtest.c:5
    
    $ gdb divtest
    (gdb) disass /m 0x400504
    Dump of assembler code for function main:
    2       {
       0x00000000004004f0 <+0>:     push   %rbp
       0x00000000004004f1 <+1>:     mov    %rsp,%rbp
    
    3               int a, b;
    4       
    5               a = 1; b = a/0;
       0x00000000004004f4 <+4>:     movl   $0x1,-0x4(%rbp)
       0x00000000004004fb <+11>:    mov    -0x4(%rbp),%eax
       0x00000000004004fe <+14>:    mov    $0x0,%ecx
       0x0000000000400503 <+19>:    cltd   
       0x0000000000400504 <+20>:    idiv   %ecx
       0x0000000000400506 <+22>:    mov    %eax,-0x8(%rbp)
    
    6       }
       0x0000000000400509 <+25>:    pop    %rbp
       0x000000000040050a <+26>:    retq   
    
    End of assembler dump.
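
Applied to the log line from the question, the same approach might look roughly like the sketch below. It is only an illustration under a few assumptions: `trap_ip` is a hypothetical helper name, `./myprog` is assumed to be the exact build the customer ran (or a debug build with identical code layout), and the program is assumed to be a non-position-independent executable so that the logged address can be passed to the tools unchanged.

```shell
#!/bin/sh
# Hypothetical helper: pull the instruction pointer out of a kernel
# "trap divide error" line. Accepts both the "rip:" and the "ip:" spelling.
trap_ip() {
    printf '%s\n' "$1" |
        sed -n 's/.*[[:space:]]r\{0,1\}ip:\([0-9a-fA-F]\{1,\}\).*/\1/p'
}

# The log line from the question.
line='kernel: myprog[16122] trap divide error rip:79dd99 rsp:2b6d2ea40450 error:0'
ip=$(trap_ip "$line")
echo "faulting ip: 0x$ip"

# Assumption: ./myprog matches the customer's build and carries debug info.
# The lookups are skipped if the binary or the tools are not present.
if [ -x ./myprog ]; then
    if command -v addr2line >/dev/null 2>&1; then
        # -f prints the enclosing function, -C demangles C++ names.
        addr2line -f -C -e ./myprog "0x$ip"
    fi
    if command -v gdb >/dev/null 2>&1; then
        # Non-interactive gdb: report the source line for that address.
        gdb -batch -ex "info line *0x$ip" ./myprog
    fi
fi
```

If the fault were inside a shared library or a position-independent executable, the logged address would first have to be made relative to the load address, which newer kernels print in the trailing `in name[base+size]` part of the message (as in the divtest example above).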