Tags: c, assembly, x86-64, reverse-engineering

How To Translate Assembly Code Back Into C?


I have the following assembly code (the comments are my own):

00000000004005f0 <check_password>:
  4005f0:   31 c0                   xor    %eax,%eax          # i=0
  4005f2:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)   # ignore
  4005f8:   0f b6 90 c0 1b 40 00    movzbl 0x401bc0(%rax),%edx# edx=foo[i]
  4005ff:   83 f2 5f                xor    $0x5f,%edx
  400602:   38 14 07                cmp    %dl,(%rdi,%rax,1)  # cmp prev result with s[i]
  400605:   75 14                   jne    40061b <check_password+0x2b> # failed return 0
  400607:   48 83 c0 01             add    $0x1,%rax          # i++
  40060b:   48 83 f8 0a             cmp    $0xa,%rax          # if (i==10)
  40060f:   75 e7                   jne    4005f8 <check_password+0x8> # run again
  400611:   31 c0                   xor    %eax,%eax
  400613:   80 7f 0a 00             cmpb   $0x0,0xa(%rdi) 
  400617:   0f 94 c0                sete   %al
  40061a:   c3                      retq   
  40061b:   31 c0                   xor    %eax,%eax
  40061d:   c3                      retq    

and I want to write the equivalent C code. Here are my two attempts:

Edit 1:

int check_password (char *s)
{
    for (int i=0; i!=10 ; i++)
    {
        if (foo[i] ^ 0x5f != s[i])
            return 0;
    }
    return 0==s[10]; // Isn't this strange? input is of size 10...
}

Edit 2:

int check_password (char *s)
{
    for (int i=0; i!=10 ; i++)
    {
        if (foo[i] ^ 0x5f != s[i])
            return 0==s[10]; // Isn't this strange? input is of size 10...
    }
    return 1;
}

I feel close to getting it but I'm stuck: is version 1 right, or version 2, or neither of them? And what would the correct translation be?

Some important things to know:

I have the following array of chars called foo which is located at address 401bc0 in memory.

char foo[10] = { 0x33, 0x3d, ...... }; // Not all values are shown

Solution

  • Did you look at actual compiler output to see if GCC reconstructs the same asm from either of your versions? https://godbolt.org/z/ed86jG9rf shows both the almost-right and the wrong versions. (I'm intentionally not saying here which one is right, so you have to go look for yourself, although the difference seems pretty obvious to me: is the cmp/sete in the fall-through path or in the mismatch path? Also, do you see any asm that would unconditionally return 1? How about 0?)

    Notice that one of them compiles back to your asm with GCC5.4, after you fix your operator-precedence bug: foo[i] ^ 0x5f != s[i] means foo[i] ^ (0x5f != s[i]), i.e. XOR with a bool. That only flips the low bit of foo[i], so the if is true whenever the result is non-zero, which has little to do with whether the bytes match. You actually need (foo[i] ^ 0x5f) != s[i] to compare the XOR result.

    I only noticed this from the asm being weird (with cmp / setne inside the loop), so I turned on -Wall and GCC told me what was going on.

    <source>:19:28: warning: suggest parentheses around comparison in operand of '^' [-Wparentheses]
             if (pass[i] ^ 0x5f != s[i])
                                ^
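
To make the precedence point concrete, here is a minimal sketch. The function names are hypothetical, not taken from the question or the Godbolt link. Because != binds tighter than ^ in C, the unparenthesized condition behaves like the first function; the second is what the comparison was meant to do.

```c
/* What `a ^ 0x5f != b` actually means: != binds tighter than ^,
 * so a is XORed with the 0-or-1 result of the comparison. */
int parsed_as_written(int a, int b)
{
    return a ^ (0x5f != b);
}

/* What was intended: XOR first, then compare the result with b. */
int intended(int a, int b)
{
    return (a ^ 0x5f) != b;
}
```

Take a = 0x33 (the first foo byte shown in the question) and a matching input byte b = 0x33 ^ 0x5f = 0x6c. Then intended(a, b) is 0, meaning no mismatch, while parsed_as_written(a, b) is 0x33 ^ 1 = 0x32. That is non-zero, so the buggy if would take the mismatch branch even though the bytes match.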
    

    We have very good tools for compiling C; take advantage of them when writing C for any reason, including reverse engineering from asm. That goes double for code you know was compiler-generated, especially if you can guess the compiler. x86-64 with the System V calling convention (arg in RDI) is usually GCC or clang. In this case they both make quite similar asm, if you use -fno-unroll-loops with clang.


    return 0==s[10]; // Isn't this strange? input is of size 10...

    Seems normal (but inefficient) to me: apparently the static const char pass[] doesn't end with a 0 ^ 0x5f terminator. Otherwise they could just let the loop run for one more iteration to check that the implicit-length C string passed as the function arg has a 0 terminator after its 10 matching bytes.
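
As a rough sketch of that alternative, suppose the table had an 11th entry holding an encoded terminator. The table contents ("abcdefghij" XORed with 0x5f) and the names demo_table11 and check_password_one_loop are made up for illustration; the real array's values are not fully shown in the question.

```c
#include <stddef.h>

/* Hypothetical table: "abcdefghij" XORed with 0x5f, followed by an
 * encoded terminator. NOT the real data from the binary. */
static const char demo_table11[11] = {
    'a' ^ 0x5f, 'b' ^ 0x5f, 'c' ^ 0x5f, 'd' ^ 0x5f, 'e' ^ 0x5f,
    'f' ^ 0x5f, 'g' ^ 0x5f, 'h' ^ 0x5f, 'i' ^ 0x5f, 'j' ^ 0x5f,
    0 ^ 0x5f    /* decodes to the '\0' terminator */
};

/* One loop covers the 10 password bytes and the terminator. */
int check_password_one_loop(const char *s)
{
    for (size_t i = 0; i != 11; i++) {
        if ((demo_table11[i] ^ 0x5f) != s[i])
            return 0;   /* mismatch, including a wrong or missing terminator */
    }
    return 1;
}
```

With this layout the eleventh iteration does the job of the separate terminator check, so no extra compare is needed after the loop.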

    This final check will reject passwords that match for the first 10 bytes but have trailing garbage after them.

    So that means it's expecting strlen(s) == 10, i.e. a char [11] to hold 10 bytes plus a terminating 0.
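
A small sketch of that behaviour, assuming a 10-byte table with no encoded terminator. The table contents ("abcdefghij" XORed with 0x5f) and the names demo_table and check_password_sketch are hypothetical, since the question elides most of the real array.

```c
#include <stddef.h>

/* Hypothetical 10-byte table: "abcdefghij" XORed with 0x5f.
 * NOT the real data from the binary. */
static const char demo_table[10] = {
    'a' ^ 0x5f, 'b' ^ 0x5f, 'c' ^ 0x5f, 'd' ^ 0x5f, 'e' ^ 0x5f,
    'f' ^ 0x5f, 'g' ^ 0x5f, 'h' ^ 0x5f, 'i' ^ 0x5f, 'j' ^ 0x5f
};

int check_password_sketch(const char *s)
{
    for (size_t i = 0; i != 10; i++) {
        if ((demo_table[i] ^ 0x5f) != s[i])
            return 0;       /* mismatch within the first 10 bytes */
    }
    return s[10] == 0;      /* accept only if the string ends right here */
}
```

Under these assumptions, "abcdefghij" is accepted, "abcdefghijX" is rejected by the final terminator check, and a shorter string such as "abc" is rejected inside the loop when its terminator fails to match the next expected byte.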