Tags: c, assembly, buffer-overflow

pushing and changing of %esp frame pointer


I have a small program, written in C, echo():

/* Read input line and write it back */
void echo() {
    char buf[8];  /* Way too small! */
    gets(buf);
    puts(buf);
}

The corresponding assembly code:

1  echo:
2  pushl %ebp                //Save %ebp on stack
3  movl  %esp, %ebp          //Set %ebp to the base of the new frame
4  pushl %ebx                //Save %ebx
5  subl  $20, %esp           //Allocate 20 bytes on stack
6  leal  -12(%ebp), %ebx     //Compute buf as %ebp-12
7  movl  %ebx, (%esp)        //Store buf at top of stack
8  call  gets                //Call gets
9  movl  %ebx, (%esp)        //Store buf at top of stack
10 call  puts                //Call puts
11 addl  $20, %esp           //Deallocate stack space
12 popl  %ebx                //Restore %ebx
13 popl  %ebp                //Restore %ebp
14 ret                       //Return

I have a few questions.

  1. Why does line 5 allocate 20 bytes by subtracting from %esp? buf is only 8 bytes, so why the extra 12?

  2. The return address is right above where we pushed %ebp, right? (Assuming we draw the stack upside down, so that it grows downward.) What is the purpose of the old %ebp (which the current %ebp points at, as a result of line 3)?

  3. If I want to change the return address (by inputting anything more than 12 bytes), it would change where echo() returns to. What is the consequence of changing the old %ebp (i.e. the 4 bytes just before the return address)? Is there any possibility of changing the return address, or where echo returns to, by changing only the old %ebp?

  4. What is the purpose of %ebp? I know it's the frame pointer, but what is that?

  5. Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? Like if we declare buf[8] but it stores it at -16(%ebp) instead of -12(%ebp) on line 6?

* C code and assembly copied from Computer Systems: A Programmer's Perspective, 2nd ed.

** I am using gets() deliberately, because I am studying buffer overflows.


Solution

  • The 20 bytes are allocated for the sake of stack alignment. GCC 4.5+ generates code that keeps each callee's local stack space aligned to a 16-byte boundary, so that compiled code can do aligned SSE loads and stores on the stack in a well-defined manner. In this case the compiler therefore has to throw away some stack space so that gets/puts get a properly aligned frame.

    In essence, this is how the stack will look, where each line is a 4-byte word except for --- lines that denote 16-byte address boundaries:

    ...
    Saved EIP from caller
    Saved EBP
    ---
    Saved EBX       # This is where echo's frame starts
    buf
    buf
    Unused
    ---
    Unused
    Parameter to gets/puts
    Saved EIP
    Saved EBP
    ---
    ...             # This is where gets'/puts' frame starts
    

    As you can hopefully see from my fantastic ASCII graphics, if it weren't for the "unused" portions, gets/puts would get an unaligned frame. Note, however, that not all 12 of the extra bytes are unused: 4 of them are reserved for the parameter, so only 8 are padding.

    Is it ever possible for the compiler to put the buffer somewhere that is not right next to where the old %ebp is stored? Like if we declare buf[8] but it stores it at -16(%ebp) instead of -12(%ebp) on line 6?

    Certainly. The compiler is free to organize the stack however it likes. To do buffer overflows predictably, you have to look at one specific compiled binary of the program and read the offsets off its disassembly.

    As for what the purpose of EBP is (and thus to answer your questions 2, 3 and 4), please see any introductory text on how the call stack is organized, such as the Wikipedia article.