
Why does scanf() appear to load an address lower than that of the buffer I am writing to?


I wrote a C program with an intentional buffer overflow for a class assignment. In main, I read a name from the user into a character array of length 50. The name is then passed to printHello, which prints the message "Hello, user!" with "user" replaced by the name provided. I do no length checking in the scanf() call; it simply reads input until a newline character is encountered. As a result, I am able to overrun the buffer, overwrite the return address of main, and cause a segmentation fault.

When I disassemble main in GDB, I can see that the address [ebp - 0x3a] is loaded and pushed onto the stack as an argument for scanf (see the disassembly below). I assumed that this was the start of the buffer, until I converted 0x3a to decimal and found that its value is 58. Why would there be an additional 8 bytes allotted to the character buffer? And why, when I run this buffer overflow, do I only need 54 characters to overrun the buffer when the buffer appears to start 58 bytes away from ebp and 62 bytes away from the return address? Again, I calculated the distance to the return address using ebp-0x3a.

Code:

#include <stdio.h>
#include <string.h>
void printHello(char fname[]);
int main() {
 
    char name[50]; 
    printf("Please enter a name to print a hello message!"); 
    scanf("%[^\n]", name); 

    printHello(name); 
    return 0;
}
void printHello(char fname[50]){

    int strLen = strlen(fname);

    printf("Hello, ");
    for(int i=0; i<strLen; i++){

        printf("%c", fname[i]);
    }
    printf("!\n");
}

Disassembled main function:

Dump of assembler code for function main:
   0x080484fb <+0>: lea    ecx,[esp+0x4]
   0x080484ff <+4>: and    esp,0xfffffff0
   0x08048502 <+7>: push   DWORD PTR [ecx-0x4]
   0x08048505 <+10>:    push   ebp
   0x08048506 <+11>:    mov    ebp,esp
   0x08048508 <+13>:    push   ecx
   0x08048509 <+14>:    sub    esp,0x44
   0x0804850c <+17>:    sub    esp,0xc
   0x0804850f <+20>:    push   0x8048640
   0x08048514 <+25>:    call   0x8048390 <printf@plt>
   0x08048519 <+30>:    add    esp,0x10
   0x0804851c <+33>:    sub    esp,0x8
   0x0804851f <+36>:    lea    eax,[ebp-0x3a]
   0x08048522 <+39>:    push   eax
   0x08048523 <+40>:    push   0x804866e
   0x08048528 <+45>:    call   0x80483e0 <__isoc99_scanf@plt>
   0x0804852d <+50>:    add    esp,0x10
   0x08048530 <+53>:    sub    esp,0xc
   0x08048533 <+56>:    lea    eax,[ebp-0x3a]
   0x08048536 <+59>:    push   eax
   0x08048537 <+60>:    call   0x804854c <printHello>
   0x0804853c <+65>:    add    esp,0x10
   0x0804853f <+68>:    mov    eax,0x0
   0x08048544 <+73>:    mov    ecx,DWORD PTR [ebp-0x4]
   0x08048547 <+76>:    leave  
   0x08048548 <+77>:    lea    esp,[ecx-0x4]
   0x0804854b <+80>:    ret    
End of assembler dump.

Solution

  • I assumed that this is the start of the buffer, until I converted 0x3a to decimal and found out its value was 58.

    That is the start of the buffer, but why would you assume that it should be at a particular offset from ebp? There is no written rule that says a function's stack frame must be exactly the size of its local variables. A compiler is pretty much allowed to do whatever it wants. In fact, it could end up using more space in order to preserve register values, maintain alignment, or even just waste it whenever it feels like it. This has been asked countless times, but unfortunately there really is no definitive answer. You might as well become a GCC developer to try and understand it :')

    Here's some existing questions with excellent answers for reference:

    In addition to the above, you are compiling with no optimizations, as I can tell from nonsensical instructions like add esp,0x10; sub esp,0x8. GCC likes to move stuff back and forth to and from the stack when no optimizations are enabled, and it also doesn't take much care to manage stack space in the best way possible.

  • Why when I try to run this buffer overflow, do I only need 54 characters to overrun the buffer

    You technically only need 50 characters of input to overrun the buffer, because scanf() automatically appends a terminating \0, which then lands one byte past the end of the array. However, that might not be enough to "break" anything.
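To make the "50 characters already overruns" point concrete, here is a minimal sketch in C. The helper names `bytes_stored_by_scanset` and `overruns_name` and the macro `NAME_CAPACITY` are hypothetical, introduced only for illustration. The code does not call scanf() itself; it only counts how many bytes an unbounded `%[^\n]` conversion would store for a given input line, assuming the conversion behaves as the C standard describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_CAPACITY 50 /* size of `char name[50]` in the question */

/* Number of bytes an unbounded "%[^\n]" conversion would store for `line`:
 * every character before the first newline, plus the terminating '\0'.
 * If the line starts with a newline (or is empty), the conversion fails
 * to match and nothing is stored, so 0 is returned. */
size_t bytes_stored_by_scanset(const char *line)
{
    assert(line != NULL);
    size_t matched = strcspn(line, "\n"); /* characters before '\n' or end */
    return matched == 0 ? 0 : matched + 1;
}

/* Nonzero when the stored bytes would not fit in a NAME_CAPACITY-byte array. */
int overruns_name(const char *line)
{
    return bytes_stored_by_scanset(line) > NAME_CAPACITY;
}
```

A 49-character name needs exactly 50 bytes and fits, while a 50-character name needs 51 bytes, so its terminator is written one byte past the array.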

    To make this clearer, let's assume that initially when main() is called esp is 0x1000. The stack layout at the moment of the call to scanf() (right before call is executed) should be the following if my math is right:

    esp -> 0x0fa0: 0x804866e // scanf() arg1
           0x0fa4: 0x0fbe    // scanf() arg2
           0x0fa8: ????
           0x0fac: ????
           0x0fb0: ????
           0x0fb4: ????
           0x0fb8: ????
           0x0fbc: ??AA <-- eax == 0x0fbe == ebp-0x3a
           0x0fc0: AAAA   
           0x0fc4: AAAA
           0x0fc8: AAAA
           0x0fcc: AAAA
           0x0fd0: AAAA
           0x0fd4: AAAA
           0x0fd8: AAAA
           0x0fdc: AAAA
           0x0fe0: AAAA
           0x0fe4: AAAA
           0x0fe8: AAAA
           0x0fec: AAAA
           0x0ff0: ????
           0x0ff4: 0x1004 // saved original esp+0x4, later used to restore esp
    ebp -> 0x0ff8: <saved ebp>
           0x0ffc: <copy of return address> // pushed by push DWORD PTR [ecx-0x4]
           0x1000: <return address of main> // 0x1000 original esp at start of main()
           0x1004: ????
    

    In the above diagram, the As denote your array, which starts at 0x0fbe.
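The frame addresses used here can be re-derived by stepping through main's prologue by hand. The following is a minimal sketch in C, not part of the original program: `trace_prologue` and `struct frame_layout` are hypothetical names, and the code only mimics the arithmetic of the prologue instructions in the disassembly, under the same assumption that esp is 0x1000 on entry.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses of interest inside main's frame (hypothetical helper type). */
struct frame_layout {
    uint32_t ecx_value;  /* value loaded by lea ecx,[esp+0x4]           */
    uint32_t ebp;        /* frame pointer after mov ebp,esp             */
    uint32_t saved_ecx;  /* address of the slot written by push ecx     */
    uint32_t buf_start;  /* ebp-0x3a, the address handed to scanf()     */
    uint32_t buf_end;    /* one past the last byte of the 50-byte array */
};

/* Replays the stack-pointer arithmetic of main's prologue. */
struct frame_layout trace_prologue(uint32_t esp_at_entry)
{
    struct frame_layout f;
    uint32_t esp = esp_at_entry;

    assert(esp >= 0x100);         /* keep the subtractions from wrapping  */

    f.ecx_value = esp + 0x4;      /* lea  ecx,[esp+0x4]                   */
    esp &= 0xfffffff0u;           /* and  esp,0xfffffff0                  */
    esp -= 4;                     /* push DWORD PTR [ecx-0x4]             */
    esp -= 4;                     /* push ebp                             */
    f.ebp = esp;                  /* mov  ebp,esp                         */
    esp -= 4;                     /* push ecx                             */
    f.saved_ecx = esp;            /* this slot is later read at [ebp-0x4] */

    f.buf_start = f.ebp - 0x3a;   /* lea  eax,[ebp-0x3a]                  */
    f.buf_end = f.buf_start + 50; /* char name[50]                        */
    return f;
}
```

Calling `trace_prologue(0x1000)` gives ebp = 0x0ff8, a saved-ecx slot at 0x0ff4 holding 0x1004, and an array spanning 0x0fbe up to (but not including) 0x0ff0.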

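    You are most probably getting a segmentation fault at exactly 54 characters (55 bytes with the terminator) because that is the bare minimum needed to alter the saved esp+0x4 value (0x1004 in the example). The array starts at ebp-0x3a and that saved value sits at ebp-0x4, so the two are 0x3a - 0x4 = 0x36 = 54 bytes apart, and the 55th byte written (the terminator) lands on the lowest byte of the saved value. This causes trouble later on when the value is used to restore esp (mov ecx,DWORD PTR [ebp-0x4]; leave; lea esp,[ecx-0x4]), ending up with an invalid stack pointer.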
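The 54-character threshold can also be checked with a small simulation. This is a hedged sketch rather than a reproduction of the real process: `saved_ecx_after_input` is a hypothetical helper that models the bytes from ebp-0x3a up to ebp+7 as a plain array, stores a saved value at the [ebp-0x4] slot in little-endian order, and then performs the same unchecked write that scanf() would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    BUF_OFFSET   = 0x3a,           /* array starts at ebp-0x3a            */
    SAVED_OFFSET = 0x04,           /* saved ecx is read back at [ebp-0x4] */
    FRAME_BYTES  = BUF_OFFSET + 8  /* models ebp-0x3a through ebp+7       */
};

/* Returns the value found in the saved-ecx slot after writing `n_chars`
 * bytes of 'A' followed by a '\0', starting at the beginning of the array. */
uint32_t saved_ecx_after_input(size_t n_chars, uint32_t saved_ecx)
{
    uint8_t frame[FRAME_BYTES];                /* frame[0] is ebp-0x3a  */
    uint8_t *slot = frame + (BUF_OFFSET - SAVED_OFFSET); /* offset 54   */
    uint32_t value = 0;

    assert(n_chars < sizeof frame);            /* stay inside the model */

    memset(frame, 0, sizeof frame);
    for (int i = 0; i < 4; i++)                /* little-endian store   */
        slot[i] = (uint8_t)(saved_ecx >> (8 * i));

    memset(frame, 'A', n_chars);               /* the name itself       */
    frame[n_chars] = '\0';                     /* terminator            */

    for (int i = 0; i < 4; i++)                /* little-endian load    */
        value |= (uint32_t)slot[i] << (8 * i);
    return value;
}
```

With the example value 0x1004, 53 characters leave the slot untouched, while 54 characters turn it into 0x1000 because the terminator zeroes its lowest byte. On a real run the saved value will differ, but the same single-byte change is enough to move esp away from where ret expects to find the return address.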