Tags: c, assembly, reverse-engineering, pseudocode, hopper

Not understanding Hopper decompiler output


I know some C and a little bit of assembly and wanted to start learning about reverse engineering, so I downloaded the trial of Hopper Disassembler for Mac. I created a super basic C program:

int main() {
    int a = 5;
    return 0;
}

And compiled it with the -g flag (because I saw this before and wasn't sure if it mattered):

gcc -g simple.c

Then I opened the a.out file in Hopper Disassembler and clicked on the Pseudo Code button and it gave me:

int _main() {
    rax = 0x0;
    var_4 = 0x0;
    var_8 = 0x5;
    rsp = rsp + 0x8;
    rbp = stack[2047];
    return 0x0;
}

The only line I sort of understand here is setting a variable to 0x5. I'm unable to comprehend what all these additional lines are for (such as the rsp = rsp + 0x8;), for such a simple program. Would anyone be willing to explain this to me?

Also if anyone knows of good sources/tutorials for an intro into reverse engineering that'd be very helpful as well. Thanks.


Solution

  • Looks like it is doing a particularly poor job of producing "disassembly pseudocode" (whatever that is -- is it a disassembler or a decompiler? It can't seem to decide).

    In this case it looks like it has elided the stack frame setup (the function prolog), but not the cleanup (the function epilog). You'll get a much better idea of what is going on by using an actual disassembler to look at the actual disassembly:

    $ gcc -c simple.c
    $ objdump -d simple.o
    
    simple.o:     file format elf64-x86-64
    
    Disassembly of section .text:
    
    0000000000000000 <main>:
       0:   55                      push   %rbp
       1:   48 89 e5                mov    %rsp,%rbp
       4:   c7 45 fc 05 00 00 00    movl   $0x5,-0x4(%rbp)
       b:   b8 00 00 00 00          mov    $0x0,%eax
      10:   5d                      pop    %rbp
      11:   c3                      retq   
    

    So what we have here is code to set up a stack frame (addresses 0 and 1), the assignment you wrote (4), setting up the return value (b), tearing down the frame (10), and then returning (11). You might see something different if you use a different version of gcc or a different target.

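    In the case of your Hopper pseudocode, the first part (the prolog) has been elided by the decompiler, left out as an uninteresting housekeeping task, but the second-to-last part (the epilog, which undoes the prolog) has not. That is where rsp = rsp + 0x8; and rbp = stack[2047]; come from: taken together they describe a pop of the saved rbp, which reloads rbp from the stack and moves rsp up by 8 bytes.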
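
To make the prolog/epilog pairing concrete, here is a minimal sketch in C that replays the objdump listing above on a toy model of the stack. Everything in it (ToyCpu, toy_push, toy_pop, toy_main, the 16-slot stack) is a made-up illustration, not how Hopper or the CPU is implemented: rsp is an array index rather than an address, so one slot stands for 8 bytes, and the 4-byte store to -0x4(%rbp) is approximated as a write to the slot just below the frame pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: each stack slot stands for one 8-byte word. */
#define STACK_WORDS 16u

typedef struct {
    uint64_t stack[STACK_WORDS];
    uint64_t rsp; /* index of the top-of-stack slot; the stack grows downward */
    uint64_t rbp; /* frame pointer (holds a slot index while inside toy_main) */
    uint64_t rax; /* return-value register */
} ToyCpu;

/* Start with an empty stack and some value the "caller" left in rbp. */
void toy_cpu_init(ToyCpu *cpu, uint64_t caller_rbp) {
    memset(cpu, 0, sizeof *cpu);
    cpu->rsp = STACK_WORDS;
    cpu->rbp = caller_rbp;
}

/* push: move rsp down one slot, then store the value there. */
static void toy_push(ToyCpu *cpu, uint64_t value) {
    assert(cpu->rsp > 0);
    cpu->rsp -= 1;
    cpu->stack[cpu->rsp] = value;
}

/* pop: load the value at rsp, then move rsp up one slot (8 bytes). */
static uint64_t toy_pop(ToyCpu *cpu) {
    assert(cpu->rsp < STACK_WORDS);
    uint64_t value = cpu->stack[cpu->rsp];
    cpu->rsp += 1;
    return value;
}

/* Replays the six instructions of main from the listing, in order. */
uint64_t toy_main(ToyCpu *cpu) {
    toy_push(cpu, cpu->rbp);      /*  0: push %rbp            (prolog)  */
    cpu->rbp = cpu->rsp;          /*  1: mov  %rsp,%rbp       (prolog)  */
    assert(cpu->rbp >= 1);
    cpu->stack[cpu->rbp - 1] = 5; /*  4: movl $0x5,-0x4(%rbp) (a = 5)   */
    cpu->rax = 0;                 /*  b: mov  $0x0,%eax       (return 0) */
    cpu->rbp = toy_pop(cpu);      /* 10: pop  %rbp            (epilog)  */
    return cpu->rax;              /* 11: retq                           */
}
```

Calling toy_cpu_init followed by toy_main leaves rsp and rbp exactly as the caller had them, which is the whole point of the epilog. In this model the single toy_pop does both things the pseudocode spells out as two separate lines: it reloads rbp from the stack and it moves rsp back up.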