Tags: c++, c, linux, gdb, objdump

Correlate Source with Assembly Listing of a C++ Program


Analyzing a core dump from a retail build often requires correlating the objdump output of a specific module with its source. Correlating the assembly dump with the source normally becomes a pain if the function is quite involved. Today I tried to create an assembly listing of one particular module (with the compile option -S), expecting to see the source interleaved with the assembly, or at least some correlation between them. Unfortunately the listing was not friendly enough to correlate, so I was wondering:

  • Given a core dump from which I can determine the crash location
  • The objdump of the failing module
  • An assembly listing produced by recompiling the module with the -S option

Is it possible to do a one-to-one correspondence with the source?

As an example I see the assembly listing as

.LBE7923:
        .loc 2 4863 0
        movq    %rdi, %r14
        movl    %esi, %r12d
        movl    696(%rsp), %r15d
        movq    704(%rsp), %rbp
.LBB7924:
        .loc 2 4880 0
        testq   %rdx, %rdx
        je      .L2680
.LVL2123:
        testl   %ecx, %ecx
        jle     .L2680
        movslq  %ecx,%rax
        .loc 2 4882 0
        testl   %r15d, %r15d
        .loc 2 4880 0
        leaq    (%rax,%rax,4), %rax
        leaq    -40(%rdx,%rax,8), %rdx
        movq    %rdx, 64(%rsp)

but I could not understand how to interpret labels like .LVL2123 and directives like .loc 2 4863 0.

Note: As the answers describe, reading through the assembly and intuitively matching patterns based on symbols (function calls, branches, return statements) is what I generally do. I am not denying that it works, but when a function is quite involved, reading through pages of assembly listing is a pain, and you often end up with a listing that seldom matches the source, either because functions were inlined or because the optimizer simply rearranged the code as it pleased. Seeing how efficiently Valgrind handles optimized binaries, and how WinDBG handles optimized binaries on Windows, I have a feeling there is something I am missing. So I thought I would start with the compiler output and use it to correlate. If my compiler is responsible for mangling the binary, it should be the best one to say how to correlate it with the source, but unfortunately its output was the least helpful, and the .loc directive is really misleading. I often have to read through unreproducible dumps across various platforms; I spend the least time debugging Windows minidumps through WinDBG and considerable time debugging Linux core dumps. I thought that maybe I am not doing things correctly, so I came up with this question.


Solution

  • Is it possible to do a one-to-one correspondence with the source?

    A: no, unless all optimisation is disabled. The compiler may emit some group of instructions (or instruction-like things) per line initially, but the optimiser then reorders, splits, fuses and generally changes them completely.


    If I'm disassembling release code, I look for the instructions that should have a clear logical relationship to the source. E.g.,

    .LBB7924:
            .loc 2 4880 0
            testq   %rdx, %rdx
            je      .L2680
    

    looks like a branch if %rdx is zero, and it comes from line 4880. Find the line, identify the variable being tested, make a note that it's currently assigned to %rdx.
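
The "it comes from line 4880" step relies on the .loc directive. In GNU assembler syntax, .loc 2 4880 0 means file number 2, line 4880, column 0, where the file number refers to a .file 2 "..." directive elsewhere in the same listing. Labels such as .LBB7924, .LBE7923 and .LVL2123 are, as far as I know, local labels GCC emits for its debug information (lexical block begin/end and variable-location ranges) and can be skipped when reading the code. As a rough sketch, assuming a listing in this format, the following C++ tags each instruction with the most recent .loc. The names AnnotatedInsn, trim and annotate are made up for illustration:

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// One instruction from the listing, tagged with the latest .loc seen before it.
struct AnnotatedInsn {
    int file;          // file number from .loc (refers to a .file directive)
    int line;          // source line from .loc
    std::string text;  // instruction text with surrounding whitespace removed
};

// Hypothetical helper: strips leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper: walks a GCC -S style listing line by line and
// attaches the current (file, line) pair to every instruction.
std::vector<AnnotatedInsn> annotate(const std::string& listing) {
    std::vector<AnnotatedInsn> out;
    std::istringstream in(listing);
    std::string raw;
    int file = 0, line = 0;  // 0 means "no .loc seen yet"
    while (std::getline(in, raw)) {
        const std::string s = trim(raw);
        if (s.empty()) continue;
        // Match ".loc" exactly, not e.g. ".local".
        const bool is_loc = s.rfind(".loc", 0) == 0 &&
            (s.size() == 4 || std::isspace(static_cast<unsigned char>(s[4])));
        if (is_loc) {
            std::istringstream fields(s.substr(4));
            int fno = 0, lno = 0;
            if (fields >> fno >> lno) { file = fno; line = lno; }
            continue;
        }
        if (s[0] == '.') continue;      // other directives and .L* local labels
        if (s.back() == ':') continue;  // ordinary labels
        out.push_back({file, line, s});
    }
    return out;
}
```

Run over the listing from the question, this tags testl %r15d, %r15d with line 4882 while the instructions on either side of it are tagged 4880, which is the optimiser interleaving two source lines. If the binary carries debug information, objdump -d -l -S does a similar mapping for you.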

    .LVL2123:
            testl   %ecx, %ecx
            jle     .L2680
    

    OK, so this test and branch has the same target, so whatever comes next knows %rdx is nonzero and %ecx is positive (jle is a signed test, so the branch is taken when %ecx <= 0). The original code might be structured like:

    if (a && b > 0) {
    

    or perhaps it was:

    if (!a || b <= 0) {
    

    and the optimiser reordered the two branches ...
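
To make the two candidate shapes concrete, here is a small sketch. The functions and their bodies are invented for illustration and are not from the original program: p stands in for whatever lives in %rdx and n for the signed value in %ecx (jle branches when the value is <= 0, hence n > 0 on the fall-through path). Both forms behave identically, so a compiler is free to emit the same pair of test-and-branch instructions for either, and the assembly alone cannot tell you which one the author wrote.

```cpp
// Nested form: the body runs only when both tests pass.
int nested_form(const int* p, int n) {
    if (p && n > 0) {
        return p[n - 1];  // work that needs a valid pointer and a positive count
    }
    return -1;            // shared skip path, playing the role of .L2680
}

// Early-exit form: the same two tests, written as a guard clause.
int early_exit_form(const int* p, int n) {
    if (!p || n <= 0) {
        return -1;        // same skip path as above
    }
    return p[n - 1];
}
```

Either way, what you learn for the rest of the function is the same: past these two branches the pointer is non-null and the count is positive.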

    Now that you've got some structure you can hopefully match to the original code, you can also figure out the register assignments. E.g., if you know the thing being tested is a data member of some structure, read backwards to see where %rdx was loaded from memory: was it loaded from a fixed offset from some other register? If so, that register is probably the object address.
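
The fixed-offset pattern can be illustrated with a made-up structure. Buffer and both functions below are hypothetical, and the offsets in the comments assume a typical x86-64 Linux ABI. A member read like b->data usually compiles to a single load of the form movq OFFSET(%reg), %rdx, where OFFSET equals offsetof(Buffer, data) and %reg holds the object address. The second function spells out that same address arithmetic by hand:

```cpp
#include <cstddef>

// Hypothetical structure, not taken from the original program.
struct Buffer {
    long        id;     // offset 0
    const char* data;   // offset 8 on a typical x86-64 Linux target
    int         count;  // offset 16 on the same target
};

// What the source says: read a member through the object pointer.
const char* load_data(const Buffer* b) {
    return b->data;
}

// What the load instruction does: object address plus a constant offset.
const char* load_data_by_offset(const Buffer* b) {
    const char* base = reinterpret_cast<const char*>(b);
    return *reinterpret_cast<const char* const*>(base + offsetof(Buffer, data));
}
```

So if you see %rdx loaded from, say, 8(%rdi), comparing that 8 against the member offsets of the candidate structures is a quick way to confirm both which object %rdi points at and which member ended up in %rdx.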

    Good luck!