I wrote a C program with an intentional buffer overflow for a class assignment. In my program, main accepts a name from the user into a character array of length 50. The name is then passed to a function, also as a character array of length 50, which prints the message "Hello, user!", where user is replaced with the name provided by the user. I do not do any length checking in the scanf() call, but instead read input until a newline character is encountered. As a result, I am able to overrun the buffer, overwrite the return address of main, and cause a segmentation fault.
When I disassemble main in GDB, I can see that the address [ebp - 0x3a] is loaded and pushed onto the stack to be used as an argument for the scanf function (see the disassembly below). I assumed that this is the start of the buffer, until I converted 0x3a to decimal and found out its value was 58. Why would there be an additional 8 bytes allotted to the character buffer? And why, when I run this buffer overflow, do I only need 54 characters to overrun the buffer, when the buffer appears to start 58 bytes away from ebp and 62 bytes away from the return address? Again, I calculated the distance to the return address using ebp-0x3a.
Code:
#include <stdio.h>
#include <string.h>

void printHello(char fname[]);

int main() {
    char name[50];
    printf("Please enter a name to print a hello message!");
    scanf("%[^\n]", name);
    printHello(name);
    return 0;
}

void printHello(char fname[50]){
    int strLen = strlen(fname);
    printf("Hello, ");
    for(int i=0; i<strLen; i++){
        printf("%c", fname[i]);
    }
    printf("!\n");
}
Disassembled main function:
Dump of assembler code for function main:
0x080484fb <+0>: lea ecx,[esp+0x4]
0x080484ff <+4>: and esp,0xfffffff0
0x08048502 <+7>: push DWORD PTR [ecx-0x4]
0x08048505 <+10>: push ebp
0x08048506 <+11>: mov ebp,esp
0x08048508 <+13>: push ecx
0x08048509 <+14>: sub esp,0x44
0x0804850c <+17>: sub esp,0xc
0x0804850f <+20>: push 0x8048640
0x08048514 <+25>: call 0x8048390 <printf@plt>
0x08048519 <+30>: add esp,0x10
0x0804851c <+33>: sub esp,0x8
0x0804851f <+36>: lea eax,[ebp-0x3a]
0x08048522 <+39>: push eax
0x08048523 <+40>: push 0x804866e
0x08048528 <+45>: call 0x80483e0 <__isoc99_scanf@plt>
0x0804852d <+50>: add esp,0x10
0x08048530 <+53>: sub esp,0xc
0x08048533 <+56>: lea eax,[ebp-0x3a]
0x08048536 <+59>: push eax
0x08048537 <+60>: call 0x804854c <printHello>
0x0804853c <+65>: add esp,0x10
0x0804853f <+68>: mov eax,0x0
0x08048544 <+73>: mov ecx,DWORD PTR [ebp-0x4]
0x08048547 <+76>: leave
0x08048548 <+77>: lea esp,[ecx-0x4]
0x0804854b <+80>: ret
End of assembler dump.
I assumed that this is the start of the buffer, until I converted 0x3a to decimal and found out its value was 58.
That is the start of the buffer, but why would you assume that it should be at a particular offset from ebp? There is no written rule that says that a function should have a stack frame exactly the size of its local variables. A compiler is pretty much allowed to do whatever it wants. In fact, it could end up using more space in order to preserve register values, to maintain alignment, or even just waste it for no apparent reason. This has been asked countless times, but unfortunately there really is no definitive answer. You might as well become a GCC developer to try and understand it :')
Here's some existing questions with excellent answers for reference:
In addition to the above, you are compiling with no optimizations, as I can tell from nonsensical instructions like add esp,0x10; sub esp,0x8. GCC likes to move stuff back and forth to/from the stack when no optimizations are enabled, and also doesn't take much care to manage stack space in the best way possible.
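If you want to see how much the layout moves around, one option is to measure it instead of predicting it. The sketch below is only an illustration: the function name buffer_to_frame_distance is made up for this example, and it relies on __builtin_frame_address, which is a GCC/Clang extension rather than standard C. The number it returns is expected to change with the compiler, the target, and the optimization flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: distance in bytes from a 50-byte local buffer to the
 * frame address of this function. Not portable: __builtin_frame_address is
 * a GCC/Clang extension, and the result depends on compiler and flags. */
ptrdiff_t buffer_to_frame_distance(void)
{
    volatile char name[50];   /* volatile so the buffer is not optimized away */
    name[0] = 'A';

    uintptr_t frame = (uintptr_t)__builtin_frame_address(0);
    uintptr_t buf = (uintptr_t)&name[0];
    assert(frame != 0);       /* level 0 always refers to the current frame */

    /* On x86 the buffer normally sits below the frame pointer, so this is
     * positive there; other targets may lay the frame out differently. */
    return (ptrdiff_t)(frame - buf);
}
```

Printing the result from a small main() and rebuilding with, for example, -O0 and -O2 should be enough to show that the offset is a compiler decision and not something fixed by the size of the array.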
Why when I try to run this buffer overflow, do I only need 54 characters to overrun the buffer?
You technically only need 50 characters of input to overrun the buffer (a terminating \0 is automatically added by scanf()). However, those might not be enough to "break" anything.
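To make the "+1 for the terminator" point concrete, here is a minimal sketch that counts how many bytes an unbounded %[^\n] conversion would store for a given line, without actually overflowing anything. The names NAME_SIZE, bytes_stored and overruns_name are invented for this example and are not part of the program in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_SIZE 50   /* matches char name[50] in the question */

/* Bytes that scanf("%[^\n]", buf) would store for this input line:
 * every character before the first newline, plus the terminating '\0'.
 * An empty line makes the conversion fail, so nothing is stored. */
size_t bytes_stored(const char *line)
{
    assert(line != NULL);
    size_t chars = strcspn(line, "\n");
    return chars == 0 ? 0 : chars + 1;
}

/* Nonzero when that many bytes do not fit in a NAME_SIZE-byte buffer. */
int overruns_name(const char *line)
{
    return bytes_stored(line) > NAME_SIZE;
}
```

With this, a 49-character line needs exactly 50 bytes and still fits, while a 50-character line needs 51 bytes and writes one byte past the end of the array.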
To make this clearer, let's assume that initially when main() is called esp is 0x1000. The stack layout at the moment of the call to scanf() (right before call is executed) should be the following if my math is right:
esp -> 0x0fa0: 0x804866e // scanf() arg1
0x0fa4: 0x0fbe // scanf() arg2
0x0fa8: ????
0x0fac: ????
0x0fb0: ????
0x0fb4: ????
0x0fb8: ????
0x0fbc: ??AA <-- eax == 0x0fbe == ebp-0x3a
0x0fc0: AAAA
0x0fc4: AAAA
0x0fc8: AAAA
0x0fcc: AAAA
0x0fd0: AAAA
0x0fd4: AAAA
0x0fd8: AAAA
0x0fdc: AAAA
0x0fe0: AAAA
0x0fe4: AAAA
0x0fe8: AAAA
0x0fec: AAAA
0x0ff0: ????
0x0ff4: 0x1004 // saved original esp+0x4, later used to restore esp
ebp -> 0x0ff8: <saved ebp>
0x0ffc: <copy of return address> // pushed by push DWORD PTR [ecx-0x4]
0x1000: <return address> // 0x1000 original esp at start of main()
0x1004: ????
In the above diagram, the As denote your array, which starts at 0x0fbe.
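If you would rather not trust hand arithmetic, the addresses around the array can be recomputed by replaying the prologue of main() instruction by instruction. This is just a sketch under the same assumption as above (esp is 0x1000 on entry); struct frame_layout and replay_prologue are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses of interest in main()'s frame, as 32-bit values. */
struct frame_layout {
    uint32_t ebp;        /* frame pointer after "mov ebp,esp" */
    uint32_t saved_ecx;  /* address of the slot written by "push ecx" */
    uint32_t ecx_value;  /* value stored in that slot (entry esp + 4) */
    uint32_t name;       /* start of name[], i.e. ebp-0x3a */
};

/* Replays the prologue of main() from the listing for a given entry esp. */
struct frame_layout replay_prologue(uint32_t entry_esp)
{
    struct frame_layout f;
    uint32_t esp = entry_esp;

    f.ecx_value = esp + 0x4;   /* lea  ecx,[esp+0x4]      */
    esp &= 0xfffffff0u;        /* and  esp,0xfffffff0     */
    esp -= 4;                  /* push DWORD PTR [ecx-0x4] */
    esp -= 4;                  /* push ebp                */
    f.ebp = esp;               /* mov  ebp,esp            */
    esp -= 4;                  /* push ecx                */
    f.saved_ecx = esp;
    f.name = f.ebp - 0x3a;     /* lea  eax,[ebp-0x3a]     */

    assert(f.saved_ecx == f.ebp - 0x4);  /* the slot later read via [ebp-0x4] */
    return f;
}
```

For an entry esp of 0x1000 this gives ebp = 0x0ff8, the saved value 0x1004 stored at 0x0ff4, and the array starting at 0x0fbe, so the saved value sits 0x36 = 54 bytes above the first byte of the array.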
You are most probably getting a segmentation fault exactly at 54 characters (+1 terminator = 55 bytes) because that is exactly the bare minimum needed to alter the saved esp+0x4 value (in the example 0x1004) and cause trouble later on when it's used to restore esp (mov ecx,DWORD PTR [ebp-0x4]; leave; lea esp,[ecx-0x4]), ending up with an invalid stack pointer.
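The same reasoning can be written down as a small lookup: given the number of characters typed, which part of the frame does the last written byte (the terminator) land in? The sketch below hard-codes the offsets from this particular disassembly (array at ebp-0x3a, saved value at ebp-0x4), so it only describes this build; enum region and highest_region_written are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NAME_SIZE      50             /* char name[50]                      */
#define BUF_BELOW_EBP  0x3a           /* array starts at ebp-0x3a           */
#define OFF_SAVED_ECX  (0x3a - 0x4)   /* offset of [ebp-0x4] from name[0]   */
#define OFF_SAVED_EBP  BUF_BELOW_EBP  /* offset of [ebp] from name[0]       */

enum region { IN_BUFFER, PADDING, SAVED_ECX, SAVED_EBP, BEYOND_EBP };

/* For a non-empty input of input_len characters, scanf writes input_len + 1
 * bytes, so the terminating '\0' lands at offset input_len from name[0].
 * Returns the region of main()'s frame that this last byte falls into. */
enum region highest_region_written(size_t input_len)
{
    assert(OFF_SAVED_ECX == 54 && OFF_SAVED_EBP == 58);

    size_t last = input_len;          /* offset of the '\0' */
    if (last < NAME_SIZE)         return IN_BUFFER;
    if (last < OFF_SAVED_ECX)     return PADDING;
    if (last < OFF_SAVED_EBP)     return SAVED_ECX;
    if (last < OFF_SAVED_EBP + 4) return SAVED_EBP;
    return BEYOND_EBP;
}
```

Inputs of 50 to 53 characters only spill into the 4 unused bytes after the array, which is why they appear harmless, while 54 characters is the first length whose terminator touches the value at ebp-0x4.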