c++ linux coredump

How to debug a segmentation fault while the gdb stack trace is full of '??'?


My executable contains a symbol table, but the stack trace appears to have been overwritten.

How can I get more information out of that core? For instance, is there a way to inspect the heap and see the object instances populating it, to get some clues? Any idea is appreciated.


Solution

  • I am a C++ programmer for a living and I have encountered this issue more times than I like to admit. Your application is smashing a huge part of the stack. Chances are the function that corrupts the stack also crashes on return, because its return address has been overwritten. That is also why GDB's stack trace is messed up: GDB walks the saved frame data on the stack, and that data is now garbage.

    This is how I debug this issue:

    1) Step through the application until it crashes. (Look for a function that crashes on return.)

    2) Once you have identified the function, declare a variable on the VERY FIRST LINE of the function:

    int canary=0;
    

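    (It must be the first line so that it sits at the top of the function's stack frame, between the other locals and the return address. An overflow running up from a local buffer then overwrites this "canary" before it reaches the return address. Compilers are free to reorder locals, so this works most reliably in an unoptimized debug build.)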

    3) Put a watch on canary (in GDB: watch canary) and step through the function. When canary != 0, you have found your buffer overflow! Another possibility is to set a conditional breakpoint that triggers when canary != 0 and just run the program normally. This is a little easier, but not all IDEs support this kind of breakpoint.
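
    The canary idea can be illustrated without corrupting a real stack. The sketch below is only a simulation under assumed conditions: it models a frame as a plain byte array with a made-up layout (a 16-byte buffer, then a 4-byte canary, then an 8-byte saved return address) and a made-up return address value. All names (Frame, unchecked_write, and so on) are hypothetical, and a real compiler may lay out a frame differently.

    ```cpp
    #include <array>
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <cstring>

    // Simulated stack frame held in one byte array, so the "overflow" never
    // leaves the array and the demo itself has no undefined behavior.
    // Assumed layout, low address -> high address:
    //   [0, 16)  local buffer
    //   [16, 20) canary
    //   [20, 28) saved return address
    constexpr std::size_t kBufferSize = 16;
    constexpr std::size_t kCanaryOffset = kBufferSize;
    constexpr std::size_t kReturnOffset = kCanaryOffset + sizeof(std::int32_t);
    constexpr std::size_t kFrameSize = kReturnOffset + sizeof(std::uint64_t);
    constexpr std::uint64_t kFakeReturnAddress = 0x401136;  // made-up value

    using Frame = std::array<unsigned char, kFrameSize>;

    // Builds a zero-filled frame (canary == 0) with the fake return address set.
    Frame make_frame() {
        Frame frame{};
        std::memcpy(frame.data() + kReturnOffset, &kFakeReturnAddress,
                    sizeof(kFakeReturnAddress));
        return frame;
    }

    // Stand-in for an unchecked copy into the local buffer: writes `length`
    // bytes of 'A' from the start of the buffer, ignoring kBufferSize.
    // The length is clamped to the frame so the simulation stays in bounds.
    void unchecked_write(Frame& frame, std::size_t length) {
        if (length > kFrameSize) length = kFrameSize;
        std::memset(frame.data(), 'A', length);
    }

    std::int32_t read_canary(const Frame& frame) {
        std::int32_t canary;
        std::memcpy(&canary, frame.data() + kCanaryOffset, sizeof(canary));
        return canary;
    }

    std::uint64_t read_return_address(const Frame& frame) {
        std::uint64_t address;
        std::memcpy(&address, frame.data() + kReturnOffset, sizeof(address));
        return address;
    }

    // Walks through a write that fits, a small overflow, and a large overflow.
    void run_canary_demo() {
        Frame frame = make_frame();

        unchecked_write(frame, 12);  // fits in the buffer
        assert(read_canary(frame) == 0);

        unchecked_write(frame, 18);  // 2 bytes past the buffer: canary trips first
        assert(read_canary(frame) != 0);
        assert(read_return_address(frame) == kFakeReturnAddress);

        unchecked_write(frame, 28);  // now the return address is clobbered too
        assert(read_return_address(frame) != kFakeReturnAddress);
    }
    ```

    Calling run_canary_demo() shows the ordering that makes the trick work: in this assumed layout, any overflow large enough to reach the return address has already changed the canary, so a watch on the canary fires at the offending write rather than at the crash.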

    EDIT: I talked to a senior programmer at my office. To understand the core dump, you need to resolve the memory addresses it contains. One way to do that is to look at the map file for the binary, which is human-readable. Here is an example of generating a map file using gcc:

    gcc -o foo -Wl,-Map,foo.map foo.c
    

    This is a piece of the puzzle, but it will still be very difficult to obtain the address of the function that is crashing. If you are running this application on a modern platform, ASLR will probably be in effect, so the raw addresses in the core dump will not match the map file directly. When the binary is built as a position-independent executable, the function addresses themselves are randomized on each run, and you have to subtract the load address of the binary before looking anything up.
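
    To make the address-resolution idea concrete, here is a minimal sketch of looking up which function an address falls in, given symbol start addresses such as those listed in a map file. The Symbol struct, the function names, and all addresses used with it are hypothetical. It assumes the table is sorted by start address, and that for a randomized binary you already know the load base (for a binary loaded at its link-time address, the base would be 0).

    ```cpp
    #include <algorithm>
    #include <cstdint>
    #include <iterator>
    #include <string>
    #include <vector>

    // One entry as it might be read from a map file (hypothetical structure).
    struct Symbol {
        std::uint64_t start;  // link-time start address of the function
        std::string name;
    };

    // Returns the name of the last symbol whose start is <= link_address,
    // or "??" if the address lies before every symbol.
    // `symbols` must be sorted by `start`. Sizes are not tracked here, so an
    // address past the end of the last function still maps to that function.
    std::string resolve(const std::vector<Symbol>& symbols,
                        std::uint64_t link_address) {
        auto it = std::upper_bound(
            symbols.begin(), symbols.end(), link_address,
            [](std::uint64_t address, const Symbol& symbol) {
                return address < symbol.start;
            });
        if (it == symbols.begin()) return "??";
        return std::prev(it)->name;
    }

    // Under ASLR the runtime address is load_base + link-time address,
    // so subtract the base before doing the lookup.
    std::string resolve_runtime(const std::vector<Symbol>& symbols,
                                std::uint64_t runtime_address,
                                std::uint64_t load_base) {
        if (runtime_address < load_base) return "??";
        return resolve(symbols, runtime_address - load_base);
    }
    ```

    For example, with symbols at 0x1130, 0x1190 and 0x1200 and an assumed load base of 0x555555554000, the runtime address 0x555555555140 becomes 0x1140 after the subtraction and resolves to the symbol starting at 0x1130.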