What does the stack look like when printf is called?

I was just learning about format string vulnerabilities, which prompted this question.

Consider the following simple program:

#include <stdio.h>
void main(int argc, char **argv)
{
    char *s = "SomeString";
    printf(argv[1]);
}

Clearly, this code has a format string vulnerability: when the command-line argument is %s, the value SomeString is printed, since printf pops the stack once.

What I don't understand is the structure of the stack when printf is called.

In my head I imagine the stack to be as follows:

grows from left to right ----->

main()                                                                  ---> printf()-->
RET to libc_main | address of 's' | current registers| ret ptr to main | ptr to format string|

If this is the case, how does passing %s to the program cause the value of s to be popped?

(Or, if I am totally wrong about the stack structure, please correct me.)

Solution

The stack contents depend heavily on the following:

the CPU
the compiler
the calling conventions (i.e. how parameters are passed in the registers and on the stack)
the code optimizations performed by the compiler

This is what I get by compiling your tiny program with x86 mingw using gcc stk.c -S -o stk.s:

        .file   "stk.c"
        .def    ___main;        .scl    2;      .type   32;     .endef
        .section .rdata,"dr"
LC0:
        .ascii "SomeString\0"
        .text
        .globl  _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
LFB6:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        andl    $-16, %esp
        subl    $32, %esp
        call    ___main
        movl    $LC0, 28(%esp)
        movl    12(%ebp), %eax
        addl    $4, %eax
        movl    (%eax), %eax
        movl    %eax, (%esp)
        call    _printf
        leave
        .cfi_restore 5
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
LFE6:
        .def    _printf;        .scl    2;      .type   32;     .endef

And this is what I get using gcc stk.c -S -O2 -o stk.s, that is, with optimizations enabled:

        .file   "stk.c"
        .def    ___main;        .scl    2;      .type   32;     .endef
        .section        .text.startup,"x"
        .p2align 2,,3
        .globl  _main
        .def    _main;  .scl    2;      .type   32;     .endef
_main:
LFB7:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        andl    $-16, %esp
        subl    $16, %esp
        call    ___main
        movl    12(%ebp), %eax
        movl    4(%eax), %eax
        movl    %eax, (%esp)
        call    _printf
        leave
        .cfi_restore 5
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
LFE7:
        .def    _printf;        .scl    2;      .type   32;     .endef

As you can see, in the latter case there's no pointer to "SomeString" on the stack. In fact, the string isn't even present in the compiled code.

In this simple code there are no registers saved on the stack because there aren't any variables allocated to registers that need to be preserved across the call to printf().

So, apart from main()'s own saved %ebp and return address, the only things you get on the stack here are the string pointer (in the unoptimized build only), unused space due to stack alignment (andl $-16, %esp aligns the stack, and subl $32, %esp, or subl $16, %esp in the optimized build, reserves space for local variables and outgoing call arguments), printf()'s parameter, and the return address for returning from printf() back to main().
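To make the effect of the andl/subl pair concrete, here is a small sketch of the same arithmetic in C. The helper names are hypothetical, and the starting value of esp in the note below is made up for illustration; the real value depends on the system and the run.

```c
#include <stdint.h>

/* Mirrors `andl $-16, %esp`: as a 32-bit value, -16 is 0xFFFFFFF0, so the
   AND clears the low four bits and rounds the address down to a multiple
   of 16. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & (uint32_t)-16;
}

/* Mirrors `subl $N, %esp`: the x86 stack grows toward lower addresses, so
   reserving N bytes means subtracting N from the stack pointer. */
uint32_t reserve(uint32_t esp, uint32_t bytes)
{
    return esp - bytes;
}
```

With an assumed esp of 0x0028FF28 after the prologue, align_down_16 gives 0x0028FF20 and reserve(..., 32) gives 0x0028FF00. In the unoptimized build, the pointer to "SomeString" would then live at 0x0028FF1C (esp+28) and printf()'s argument at 0x0028FF00 (esp).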

In the former case the pointer to "SomeString" and the printf()'s parameter (value of argv[1]) are quite far away from one another:

        movl    $LC0, 28(%esp)  # address of "SomeString" is stored at esp+28
        movl    12(%ebp), %eax
        addl    $4, %eax
        movl    (%eax), %eax
        movl    %eax, (%esp)    # value of argv[1] (pointer to the format string) is stored at esp
        call    _printf
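The three instructions between the two stores are just pointer arithmetic on argv. As a rough sketch, this is what they amount to in C; the function name load_argv1 is hypothetical, and the byte offset is written as sizeof(char *) (4 on 32-bit x86) instead of the literal 4 from the listing.

```c
/* Step-by-step equivalent of:
       movl 12(%ebp), %eax    load argv (main's second parameter)
       addl $4, %eax          advance by one pointer, giving &argv[1]
       movl (%eax), %eax      load the pointer stored there, i.e. argv[1] */
const char *load_argv1(char **argv)
{
    char *p = (char *)argv;   /* treat argv as a raw byte address */
    p += sizeof(char *);      /* step over the argv[0] slot */
    return *(char **)p;       /* read the pointer stored in the argv[1] slot */
}
```

The result is the same as writing argv[1] directly; the optimized build folds the addition into a single movl 4(%eax), %eax.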

If you want the two addresses stored right next to each other on the stack, you'd need to play with the code or the compilation/optimization options, or use a different compiler.

Alternatively, you could supply a format string in argv[1] that makes printf() walk far enough up the stack to reach the pointer to "SomeString", for example by putting a number of dummy conversions in front of %s.
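It may help to see why extra conversions move printf() along the stack. printf() does not really pop anything: it fetches one argument per conversion, and only the format string tells it how many to fetch. Below is a minimal sketch of that mechanism, assuming a hypothetical mini_printf that understands only %u and %s. It is not how any real C library implements printf().

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-in for printf().
   Returns the number of variadic arguments it fetched. */
int mini_printf(const char *fmt, ...)
{
    va_list ap;
    int fetched = 0;

    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (fmt[0] == '%' && fmt[1] == 'u') {
            /* The format string alone decides that an unsigned is read. */
            printf("%u", va_arg(ap, unsigned));
            fetched++;
            fmt++;                          /* skip the 'u' */
        } else if (fmt[0] == '%' && fmt[1] == 's') {
            /* Whatever sits in the next argument slot is used as a pointer. */
            fputs(va_arg(ap, char *), stdout);
            fetched++;
            fmt++;                          /* skip the 's' */
        } else {
            putchar(*fmt);                  /* ordinary character */
        }
    }
    va_end(ap);
    return fetched;
}
```

Called correctly, as in mini_printf("%u:%s\n", 42u, "SomeString"), it prints 42:SomeString and returns 2. If the format string names more conversions than there are arguments, va_arg has undefined behavior; with the x86 stack-based calling convention shown above it typically just keeps reading the next stack slots, which is what the dummy conversions rely on.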

For example, if I compile the program using gcc stk.c -o stk.exe and run it as stk.exe %u%u%u%u%u%u%s, I get the following output:

4200532268676042006264200532880015253SomeString
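The count of six %u follows from the offsets in the unoptimized listing: at the call, the format pointer sits at esp and the pointer to "SomeString" at esp+28, so printf()'s variadic slots at esp+4 through esp+24 have to be consumed first. Here is a sketch of that arithmetic, assuming 4-byte argument slots passed on the stack; both helper names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Number of variadic slots between the format pointer (at offset 0 from
   esp) and the slot at target_offset. The first variadic slot sits right
   after the format pointer, hence the subtraction. */
size_t padding_specifiers(size_t target_offset, size_t slot_size)
{
    return (target_offset - slot_size) / slot_size;
}

/* Writes n copies of "%u" followed by "%s" into out.
   Returns 0 on success, -1 if the buffer is too small. */
int build_probe(char *out, size_t cap, size_t n)
{
    size_t need = 2 * n + 2 + 1;    /* n * "%u", then "%s", then the NUL */
    size_t i;

    if (cap < need)
        return -1;
    out[0] = '\0';
    for (i = 0; i < n; i++)
        strcat(out, "%u");
    strcat(out, "%s");
    return 0;
}
```

For an offset of 28 and 4-byte slots, padding_specifiers returns 6, and build_probe with n = 6 produces %u%u%u%u%u%u%s, the argument used in the run above. For the -O2 build there is nothing to aim at, since the pointer is never stored.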

All of this is pretty hacky (passing printf() fewer arguments than the format string asks for is undefined behavior), and it's not trivial to make it work right.