Tags: python, debugging, assembly, reverse-engineering, self-modifying

Debugger and cpu emulator don't detect self-modified code


Problem:

I made an ELF executable that modifies one of its own bytes: it simply changes a 0 to a 1. When I run the executable normally, I can see that the change succeeds because the program behaves exactly as expected (more on that further down). The problem arises when debugging it: the debugger (radare2) returns the wrong value when looking at the modified byte.

Context:

I made a reverse engineering challenge, inspired by Smallest elf. You can see the "source code" (if you can even call it that) there: https://pastebin.com/Yr1nFX8W.

To assemble and execute:

nasm -f bin -o tinyelf tinyelf.asm
chmod +x tinyelf
./tinyelf [flag]

If the flag is right, it returns 0. Any other value means your answer is wrong.

./tinyelf FLAG{wrong-flag}; echo $?

... outputs "255".

!Solution SPOILERS!

It's possible to reverse it statically. Once that is done, you will find that each character of the flag is computed like this:

flag[i] = b[i] + b[i+32] + b[i+64] + b[i+96];

...where i is the index of the character and b is the bytes of the executable itself. Here is a C program that solves the challenge without a debugger:

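#include <stdio.h>

int main(void)
{
    char buffer[128];
    FILE* fp;

    // open in binary mode: the input is an executable, not text
    fp = fopen("tinyelf", "rb");
    if (fp == NULL || fread(buffer, sizeof buffer, 1, fp) != 1) {
        fprintf(stderr, "could not read 128 bytes from tinyelf\n");
        return 1;
    }
    fclose(fp);

    int i;
    char c = 0;
    for (i = 0; i < 32; i++) {
        c = buffer[i];

        // handle self-modifying code
        if (i == 10) {
            c = 0;
        }

        c += buffer[i+32] + buffer[i+64] + buffer[i+96];
        printf("%c", c);
    }
    printf("\n");
    return 0;
}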

You can see that my solver handles a special case: When i == 10, c = 0. That's because it's the index of the byte that is modified during execution. Running the solver and calling tinyelf with it I get:

FLAG{Wh3n0ptiMizaTioNGOesT00F4r}
./tinyelf FLAG{Wh3n0ptiMizaTioNGOesT00F4r} ; echo $?

Output: 0. Success!
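
For comparison with the Python scripts that follow, the same static calculation can be sketched in Python. This is only an illustration of the formula: the function name `solve` and the synthetic 128-byte image are assumptions made for the example, not part of the challenge, and the image is not the real tinyelf.

```python
def solve(image: bytes, patched_index: int = 10) -> str:
    """Apply flag[i] = b[i] + b[i+32] + b[i+64] + b[i+96] to a 128-byte image."""
    chars = []
    for i in range(32):
        # The byte at patched_index is treated as 0, mirroring the C solver.
        first = 0 if i == patched_index else image[i]
        total = first + image[i + 32] + image[i + 64] + image[i + 96]
        chars.append(chr(total & 0xFF))  # keep the low byte, like an 8-bit register
    return "".join(chars)


# Synthetic image (hypothetical data): only the last 32-byte row is non-zero,
# plus a stray on-disk value at index 10 that the solver must ignore.
demo = bytearray(128)
for i, ch in enumerate("FLAG{demo}".ljust(32, "_")):
    demo[i + 96] = ord(ch)
demo[10] = 0x01

print(solve(bytes(demo)))
```

With the real file, the call would be something like `solve(open('tinyelf', 'rb').read(128))`.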

Great, let's try to solve it dynamically now, using python and radare2:

import r2pipe

r2 = r2pipe.open('./tinyelf')

r2.cmd('doo FLAG{AAAAAAAAAAAAAAAAAAAAAAAAAA}')
r2.cmd('db 0x01002051')

flag = ''
for i in range(0, 32):
    r2.cmd('dc')
    eax = r2.cmd('dr? al')
    c = int(eax, 16)
    flag += chr(c)

print('\n\n' + flag)

It puts a breakpoint on the instruction that compares the input characters with the expected characters, then reads the value the input is compared with (al). This SHOULD work. Yet, here is the output:

FLAG{Wh3n0�tiMiza�ioNGOesT00F4r}

Two incorrect values, one of which is at index 10 (the modified byte). Weird, maybe a bug in radare2? Let's try Unicorn (a CPU emulator) next:

from unicorn import *
from unicorn.x86_const import *
from pwn import *

ADDRESS = 0x01002000

mu = Uc(UC_ARCH_X86, UC_MODE_32)
# read the executable as raw bytes (binary mode)
with open('./tinyelf', 'rb') as f:
    code = f.read()

mu.mem_map(ADDRESS, 20 * 1024 * 1024)

mu.mem_write(ADDRESS, code)

mu.reg_write(UC_X86_REG_ESP, ADDRESS + 0x2000)
mu.reg_write(UC_X86_REG_EBP, ADDRESS + 0x2000)

mu.mem_write(ADDRESS + 0x2000, p32(2)) # argc
mu.mem_write(ADDRESS + 0x2000 + 4, p32(ADDRESS + 0x5000)) # argv[0]
mu.mem_write(ADDRESS + 0x2000 + 8, p32(ADDRESS + 0x5000)) # argv[1]
mu.mem_write(ADDRESS + 0x5000, b"x" * 32)

flag = ''

def hook_code(uc, address, size, user_data):
    global flag
    eip = uc.reg_read(UC_X86_REG_EIP)

    # same address as the radare2 breakpoint: the comparison instruction
    if eip == 0x01002051:
        c = uc.reg_read(UC_X86_REG_EAX) & 0x7f
        #print(str(c) + " " + chr(c))
        flag += chr(c)

mu.hook_add(UC_HOOK_CODE, hook_code)

try:
    mu.emu_start(0x01002004, ADDRESS + len(code))
except Exception:
    print(flag)

This time the solver outputs: FLAG{Wh3n0otiMizaTioNGOesT00F4r}

Notice index 10: 'o' instead of 'p'. That's an off-by-one error exactly where the byte is modified. That can't be a coincidence, right?

Does anyone have an idea why neither of these scripts works? Thank you.


Solution

  • There is no issue with radare2. Your analysis of the program is incomplete, so the code you wrote handles this challenge incorrectly.

    Let's start with this claim:

    When i == 10, c = 0. That's because it's the index of the byte that is modified during execution.

    That is only partially true. The byte is set to zero at the beginning, but after each round this code runs:

    xor al, byte [esi]                               
    or byte [ebx + 0xa], al
    

    So let's understand what's happening here. al is the flag character that was just calculated, esi points to the flag that was passed as an argument, and [ebx + 0xa] currently holds 0 (set at the beginning). The byte at index 0xa therefore stays zero only as long as each calculated character equals the corresponding character of the argument. Since you are running r2 with a fake flag, the two start to differ at the 6th character, and the result becomes visible at the first �, at index 10. To mitigate that we need to update your script a little bit.
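
    To make the accumulation concrete, here is a minimal sketch that models the two instructions above in Python. It assumes the xor/or pair runs once per round and that al holds the correct character in every round before index 10; the function name `byte_at_0xa` is made up for the illustration.

```python
def byte_at_0xa(expected: str, supplied: str, rounds: int) -> int:
    """OR together expected[i] ^ supplied[i] over the first `rounds` characters."""
    acc = 0  # the byte at [ebx + 0xa] starts out as 0
    for e, s in zip(expected[:rounds], supplied[:rounds]):
        acc |= ord(e) ^ ord(s)  # xor al, byte [esi] ; or byte [ebx + 0xa], al
    return acc


expected = "FLAG{Wh3n0ptiMizaTioNGOesT00F4r}"
supplied = "FLAG{" + "A" * 26 + "}"  # the fake flag passed to r2

acc = byte_at_0xa(expected, supplied, rounds=10)
# The character at index 10 includes b[10], which is no longer 0.
observed = (ord(expected[10]) + acc) & 0xFF
print(hex(acc), hex(observed))  # prints: 0x7f 0xef
```

    Under these assumptions the first five rounds contribute nothing (the prefix FLAG{ matches), and the character at index 10 comes out as 0xef instead of 'p' (0x70), which is consistent with the unprintable byte in the r2 output.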

    eax = r2.cmd('dr? al')
    c = int(eax, 16)
    r2.cmd("ds 2")
    r2.cmd("dr al = 0x0")
    

    After the breakpoint is hit and we have read the calculated flag character, we step two instructions further (to reach 0x01002054) and then set al to 0x0. This emulates the case where the character at [esi] was the same as the calculated one (the xor returns 0 in that case), so the byte at 0xa stays zero.

    Now the second character. This challenge is tricky ;) because the program reads its own bytes, and if you forget about that you can end up with a case like this one. Let's analyze why this character is off. It is the 18th character of the flag (index 17, counting from 0). Applying the formula, the offsets read from the binary are: 17 (dec) = 11 (hex), 17 + 32 = 49 (dec) = 31 (hex), 17 + 64 = 81 (dec) = 51 (hex), 17 + 96 = 113 (dec) = 71 (hex). Doesn't 51 (hex) look oddly familiar? Yup, it's the offset at which you set your breakpoint to read the al value.

    This is the code that breaks your second character:
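
    The offset arithmetic above can be checked with a short sketch. The base address 0x01002000 is taken from the Unicorn script in the question; the helper names are made up for the illustration.

```python
BASE = 0x01002000        # load address used in the question's Unicorn script
BREAKPOINT = 0x01002051  # address passed to `db`


def source_offsets(i):
    """File offsets that contribute to flag character i."""
    return [i + 32 * k for k in range(4)]


def chars_hit_by(address):
    """Indexes of the flag characters that read the byte at `address`."""
    offset = address - BASE
    return [i for i in range(32) if offset in source_offsets(i)]


print([hex(o) for o in source_offsets(17)])  # ['0x11', '0x31', '0x51', '0x71']
print(chars_hit_by(BREAKPOINT))              # [17]
```

    The same helper can be used to check any other address before placing a software breakpoint inside the first 128 bytes.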

    r2.cmd('db 0x01002051')
    

    Yup, your breakpoint. A software breakpoint works by writing 0xcc at the target address, so when the instruction that reads the third byte for the 18th character reaches that address, it gets 0xcc instead of 0x5b (the original value). To fix that we need to correct the calculation. It can probably be done in a smarter or more elegant way, but I went for a simple solution:

    if i == 17:
      c -= (0xcc-0x5b)
    

    Just subtract what was unintentionally added by putting a breakpoint in the code.
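
    As a worked check of that correction, here is a small sketch of the arithmetic. The values 0xcc and 0x5b come from the explanation above; the function name `undo_breakpoint` and the 8-bit mask are additions for the illustration (the mask only matters if the corrupted sum had wrapped past 0xff).

```python
INT3 = 0xCC      # byte written by a software breakpoint
ORIGINAL = 0x5B  # original byte at 0x01002051, as stated above


def undo_breakpoint(observed):
    """Remove the extra (0xcc - 0x5b) that the breakpoint added to al."""
    return (observed - (INT3 - ORIGINAL)) & 0xFF


# The correct character at index 17 is 'T' (0x54); the breakpoint adds 0x71.
observed = (ord("T") + (INT3 - ORIGINAL)) & 0xFF
print(hex(observed), chr(undo_breakpoint(observed)))  # prints: 0xc5 T
```

    The corrupted value 0xc5 is not printable ASCII, which matches the second � in the original output.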

    The final code:

    import r2pipe
    
    r2 = r2pipe.open('./tinyelf')
    print(r2)
    
    r2.cmd("doo FLAG{AAAAAAAAAAAAAAAAAAAAAAAAAA}")
    r2.cmd("db 0x01002051")
    
    flag = ''
    for i in range(0, 32):
      r2.cmd("dc")
      eax = r2.cmd('dr? al')
      c = int(eax, 16)   
      if i == 17:
        c -= (0xcc-0x5b)
      r2.cmd("ds 2")
      r2.cmd("dr al = 0x0")
      flag += chr(c)
    
    print('\n\n' + flag)
    

    That prints the correct flag:

    FLAG{Wh3n0ptiMizaTioNGOesT00F4r}

    As for Unicorn: you are not setting a breakpoint there, so the second problem goes away, and the off-by-one at index 10 has the same cause as in r2.
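
    The 'o' can be reproduced with the same model of the xor/or pair, this time with the "x" * 32 argument and the & 0x7f mask from the Unicorn hook. This is a sketch under the assumption that the pair runs once per round before index 10; it does not emulate the binary itself.

```python
expected = "FLAG{Wh3n0ptiMizaTioNGOesT00F4r}"
supplied = "x" * 32  # dummy argument written by the Unicorn script

acc = 0  # models the byte at [ebx + 0xa]
for e, s in zip(expected[:10], supplied[:10]):
    acc |= ord(e) ^ ord(s)

# The hook masks EAX with 0x7f before converting it to a character.
hooked = (ord(expected[10]) + acc) & 0x7F
print(hex(acc), chr(hooked))  # prints: 0x7f o
```

    So the result only looks like an off-by-one: the byte at 0xa has accumulated 0x7f, and the mask in the hook hides the high bit of 0xef, leaving 0x6f ('o').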