parsing, streaming, elf, crash-dumps

Is the Executable and Linkable Format (ELF) streamable?


I'd like to extract the stacktrace from crashing applications with large memory footprints. Ideally, the user wouldn't need to wait while the entire coredump is written to disk.

My current thinking is to install a coredump handler via /proc/sys/kernel/core_pattern (a pattern beginning with | pipes the dump to a program's stdin) that parses the incoming coredump and extracts just the stacktrace. Holding a complete copy of the coredump in memory would be impractical, though, so a streaming approach would be better.
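
To make "streaming" concrete, here is a minimal sketch of a forward-only reader over a pipe such as stdin. The class name ForwardReader and its methods are made up for illustration; the only assumption is that the input is a binary stream that cannot seek, so skipping means reading and discarding in bounded chunks.

```python
import io


class ForwardReader:
    """Wraps a non-seekable binary stream and tracks the absolute position."""

    def __init__(self, stream, chunk_size=1 << 16):
        self.stream = stream
        self.chunk_size = chunk_size
        self.pos = 0

    def read_exact(self, n):
        """Read exactly n bytes, or raise EOFError if the stream ends early."""
        parts = []
        remaining = n
        while remaining > 0:
            chunk = self.stream.read(remaining)
            if not chunk:
                raise EOFError(f"stream ended at offset {self.pos}")
            parts.append(chunk)
            self.pos += len(chunk)
            remaining -= len(chunk)
        return b"".join(parts)

    def skip_to(self, offset):
        """Discard bytes until the absolute offset is reached."""
        if offset < self.pos:
            raise ValueError("cannot seek backwards in a stream")
        while self.pos < offset:
            # Never hold more than chunk_size discarded bytes at once.
            self.read_exact(min(self.chunk_size, offset - self.pos))


# Demo on an in-memory stream standing in for stdin.
demo = ForwardReader(io.BytesIO(bytes(range(100))))
header = demo.read_exact(4)
demo.skip_to(50)
```

In a real handler the wrapped stream would presumably be sys.stdin.buffer. Memory use stays bounded by chunk_size plus whatever the caller explicitly reads.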

I'm new to the ELF format (http://en.wikipedia.org/wiki/Executable_and_Linkable_Format) and was wondering if it might support a streaming parser. I haven't written a streaming parser of any kind yet - I'm familiar with the concept but need pointers on how to analyze a format for stream-ability.
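
One way to analyze a format for stream-ability is to check whether every offset stored in its headers points forward, past the bytes already consumed. The sketch below does that for the ELF64 header and program header table. It assumes a little-endian ELF64 file, does not handle the extended segment count (e_phnum of 0xffff), and the function names are hypothetical.

```python
import struct

ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes
ELF64_PHDR = struct.Struct("<IIQQQQQQ")            # 56 bytes
PT_LOAD, PT_NOTE = 1, 4


def read_exact(stream, n):
    """Read exactly n bytes from a possibly pipe-backed stream."""
    data = b""
    while len(data) < n:
        chunk = stream.read(n - len(data))
        if not chunk:
            raise EOFError("unexpected end of stream")
        data += chunk
    return data


def read_segment_table(stream):
    """Parse the ELF64 header and program headers from the stream start.

    Returns (segments, table_end), where table_end is the stream offset
    just past the program header table.
    """
    fields = ELF64_HEADER.unpack(read_exact(stream, ELF64_HEADER.size))
    ident, e_phoff, e_phentsize, e_phnum = (
        fields[0], fields[5], fields[9], fields[10])
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    if e_phoff < ELF64_HEADER.size or e_phentsize < ELF64_PHDR.size:
        raise ValueError("unexpected program header layout")
    read_exact(stream, e_phoff - ELF64_HEADER.size)  # discard any gap
    segments = []
    for _ in range(e_phnum):
        raw = read_exact(stream, e_phentsize)
        (p_type, _flags, p_offset, p_vaddr,
         _paddr, p_filesz, _memsz, _align) = ELF64_PHDR.unpack(
            raw[:ELF64_PHDR.size])
        segments.append({"type": p_type, "offset": p_offset,
                         "vaddr": p_vaddr, "filesz": p_filesz})
    return segments, e_phoff + e_phnum * e_phentsize


def is_forward_only(segments, table_end):
    """True if every non-empty segment lies after the header table."""
    return all(s["offset"] >= table_end
               for s in segments if s["filesz"] > 0)
```

If is_forward_only returns True, a single pass can reach each segment by visiting them in increasing offset order. The ELF format itself does not guarantee this ordering, so a parser should check it rather than assume it.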

As a first attempt, I tried:

cat core | readelf -a

But readelf doesn't seem to support input from stdin; it expects a file name argument.

I also found this Python ELF parser, but at first glance it appears to read the entire ELF file into memory: https://github.com/eliben/pyelftools

But, if needed, maybe I could use their implementation as reference for a streaming parser.

Thanks a bunch!


Solution

  • It turns out that Google's coredumper documents the ELF core file format: https://code.google.com/p/google-coredumper/source/browse/trunk/src/elfcore.c

    This code snippet was also helpful: http://emntech.blogspot.com/2012/08/printing-backtracestack-trace-using.html

    It appears that the stack is contained in a single segment of the ELF core. The solution then is to:

    1. Read the ELF header and program headers
    2. Find a note entry of type NT_PRSTATUS
    3. Get the top-of-stack address from the stack pointer register within this entry
    4. Skip ahead to the segment containing that address
    5. Read the stack from there
    6. Ignore the rest of the coredump
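
Steps 2 to 4 can be sketched as follows, given the raw bytes of the PT_NOTE segment. This is a rough illustration, not a tested tool: it assumes an x86-64 Linux core, where pr_reg starts at byte 112 of struct elf_prstatus and rsp is the 20th 8-byte register slot, and it assumes the first NT_PRSTATUS note belongs to the crashing thread. All function names are made up for the example.

```python
import struct

NT_PRSTATUS = 1
# Assumption: x86-64 Linux layout of struct elf_prstatus, where pr_reg
# starts at byte 112 and rsp is slot 19 (zero-based) of user_regs_struct.
PR_REG_OFFSET = 112
RSP_INDEX = 19


def align4(n):
    """Round n up to the next multiple of 4 (note fields are 4-byte padded)."""
    return (n + 3) & ~3


def iter_notes(data):
    """Yield (name, type, desc) for each entry in a PT_NOTE segment."""
    pos = 0
    while pos + 12 <= len(data):
        namesz, descsz, ntype = struct.unpack_from("<III", data, pos)
        pos += 12
        name = data[pos:pos + namesz].rstrip(b"\0")
        pos += align4(namesz)
        desc = data[pos:pos + descsz]
        pos += align4(descsz)
        yield name, ntype, desc


def first_thread_rsp(note_data):
    """Return rsp from the first NT_PRSTATUS note, or None if absent."""
    for name, ntype, desc in iter_notes(note_data):
        if name == b"CORE" and ntype == NT_PRSTATUS:
            (rsp,) = struct.unpack_from(
                "<Q", desc, PR_REG_OFFSET + RSP_INDEX * 8)
            return rsp
    return None


def vaddr_to_offset(addr, load_segments):
    """Map a virtual address to a file offset.

    load_segments is a list of (vaddr, offset, filesz) tuples taken
    from the PT_LOAD program headers.
    """
    for vaddr, offset, filesz in load_segments:
        if vaddr <= addr < vaddr + filesz:
            return offset + (addr - vaddr)
    return None
```

The offset returned by vaddr_to_offset is where the raw stack bytes begin in the core. Everything between the notes and that offset can be discarded as it arrives, which is what makes a single forward pass workable when the segments appear in increasing offset order.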

    I still have some work to do in terms of resolving symbols, etc., but I'll edit this answer if my approach changes significantly. Whether the format can be 'streamed' was not really the right question to ask, but I did find a solution that allowed me to read the stacktrace without writing the entire coredump to disk.

    EDIT:

    According to the answers to "How gdb reconstructs stacktrace for C++?", reconstructing the stack in all cases is quite complex. I believe then that the final answer to this question is no: it's not possible in general to extract the stacktrace from an ELF core this way, and an ELF core is not "streamable."

    I believe, though, that there is a chance the heap could be located in a coredump and removed. This would leave the stack intact, allowing gdb to still reconstruct the stacktrace.