c linux assembly stack-overflow buffer-overflow

How does system() affect the stack in x64 linux?

I'm reading through Jon Erickson's excellent book "Hacking: the Art of Exploitation" and am trying to understand his exposition of buffer overflows. The book does seem a bit dated; in his examples he is running x86 linux, and I am having trouble replicating results on x64 (I do know that greater stack protection has been added in recent years). In particular I am struggling to replicate his exploit_notesearch.c program.

Early in the book he demonstrated a program notesearch.c that runs suid root and has the following initial lines, after library inclusions and function declarations:

int main(int argc, char *argv[]) {
    int userid, printing=1, fd;
    char searchstring[100];

    if(argc>1)
        strcpy(searchstring, argv[1]);
    else
        searchstring[0]=0;
...

Now, Erickson later demonstrates an exploit for this program, called exploit_notesearch.c, the skeleton of which is:

char shellcode[]=
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a"
"\x0b\x58\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69"
"\x6e\x89\xe3\x51\x89\xe2\x53\x89\xe1\xcd\x80";

int main(int argc, char *argv[]) {
    unsigned int i, *ptr, ret, offset=170;
    char *command, *buffer;
...
    if(argc>1)
        offset=atoi(argv[1]);
    ret=(unsigned int) &i-offset;
...
    system(command);
    free(command);
}

The areas that I've elided are just copying the right data into command, starting with writing "./notesearch '", then injecting 60 NOP bytes, then injecting the data held in shellcode, and then filling the rest of the allocated memory with the address ret and ending the string with '.

As I understand it, the idea of the exploit should be as follows. Upon executing the line system(command), the system will push a new stack frame for the main function of notesearch onto the stack. At the bottom of this stack frame is the address that EIP should return to upon completing the main function, and somewhere in the middle is space allocated for the searchstring buffer. ret is meant to approximate the start of the space allocated for searchstring, which we overwrite with NOP instructions (as a fudge factor), the shellcode (which upon execution opens a root shell), and then dozens of copies of the address ret to ensure that we will overwrite the return address for EIP. The system executes main as normal, but then, rather than returning to an address in the code of exploit_notesearch, it instead returns to the address ret and proceeds to execute the shellcode as desired. The idea behind defining ret is that i lives somewhere in the stack frame directly above the stack frame for the main function of notesearch.c, so searchstring should not live too far away from i, and hence by experimenting with different offsets we should be able to find one that works. (The NOP sled means we do not have to be completely precise.)

I think I have mostly understood this correctly, but there are a few problems. The principal one is that my understanding of how system() works is based on guesswork, as Erickson does not detail how it works. To understand what's going on I've tried to rewrite this program to be compatible with x64 linux, making the following changes:

replacing the shellcode given by Erickson with shellcode written for x64
changing i, ret, and offset to unsigned long ints and changing the increment of i in the for loop to 8, to account for larger memory addresses
compiling both notesearch.c and exploit_notesearch.c with the -fno-stack-protector flag

However, this has not worked at all, so for debugging I added a line to notesearch.c that prints the address searchstring and a line to exploit_notesearch.c that prints the address ret. Upon running ./exploit_notesearch several times I got strange results:

trial 1:
ret:          0x7ffdc21f25ce
searchstring: 0x7ffee3c209a0

trial 2:
ret:          0x7fff6115703e
searchstring: 0x7ffd1233afb0

trial 3:
ret:          0x7ffeab00781e
searchstring: 0x7fff3c8a8760

So, what's going on here? It seems like calling system() changes the stack in really unpredictable ways, sometimes putting the new stack frame below the old one and sometimes putting it above. Debugging with gdb did not help, as it seems the whole call of system() is bundled up into a single line call 0x555555554710 <system@plt>, which did not provide any insight.

So, my main questions are these:

how do shell commands called with system() interact with the stack?
is this done differently in x64 linux than it was in x86, or have I really misunderstood the code written by Erickson?
is there a way to disable these security measures on x64 linux when compiling, so that I can follow along with Erickson's code while I am learning?

Apologies for the long-winded question and thanks in advance.

Edit: Per the suggestion of Jester below I have disabled ASLR and now the program works correctly. As a follow-up question then, does anyone have any references for understanding ASLR? Cheers!

Solution

Upon executing the line system(command), the system will push a new stack frame for the main function of notesearch onto the stack

No. That's completely wrong. system() is a convenience library function, not a syscall: it calls fork to create a child process, and the child then uses the execve syscall to run /bin/sh -c with your command, while the parent waits:

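system("xxx");

// Roughly equivalent to (signal handling omitted):

extern char **environ;
int wstatus;
pid_t child = fork();

if (child == -1) {
    return -1;
} else if (child == 0) {
    char *const args[] = {"sh", "-c", "xxx", NULL};
    execve("/bin/sh", args, environ); // replace the child with a shell
    _exit(127);                       // only reached if execve fails
} else {
    waitpid(child, &wstatus, 0); // parent waits for the child to finish
    return wstatus;              // wait status; decode with WEXITSTATUS()
}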

It starts a new shell that executes the program (or commands) you pass as the argument. fork first creates a child that is a copy of the parent; then, in the child, execve discards the old program image (code, heap, and stack) and loads the new program in its place. A new stack is created, and the new program starts. The shell then launches ./notesearch via execve again, so notesearch's main runs in its own process on its own stack, not in a frame pushed on top of exploit_notesearch's stack.

how does system() interact with the stack?

It doesn't interact with it in any particular way; it's just a normal library function, as I said above. When the child calls execve, the kernel replaces that forked clone with a freshly initialized process image that has its own virtual address space (with ASLR applied to it separately). Meanwhile the parent simply waits for the shell process to exit. None of this has any effect on the address space of the process that called system().
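To illustrate that a forked child cannot disturb the caller's memory, here is a minimal sketch. The function name parent_value_after_child_write is made up for this example; the child overwrites a stack variable and exits, and the parent still sees the original value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's view of `value` after a forked
   child has overwritten its own copy. Returns -1 if fork or waitpid fails. */
int parent_value_after_child_write(void)
{
    volatile int value = 1;          /* lives on this process's stack */

    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        value = 99;                  /* modifies only the child's copy */
        _exit(0);
    }

    if (waitpid(child, NULL, 0) == -1)
        return -1;

    return value;                    /* unchanged in the parent */
}
```

Calling it from main and printing the result should show 1. An execve in the child goes one step further: it throws even that copy away and builds a new address space from scratch.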

is this done differently in x64 linux than it was in x86, or have I really misunderstood the code written by Erickson?

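You definitely have misunderstood that part of the code. All that system() does here is run the vulnerable program with a precisely crafted argv[1], which overflows searchstring and overwrites the return address of notesearch's main(), giving you control of RIP (EIP on x86). Using &i to guess ret works in the book not because the two stack frames are adjacent, but because, without ASLR, every new process starts its stack at (nearly) the same address, so a stack address in one process is a good estimate of a stack address in another.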

It seems like calling system() changes the stack in really unpredictable ways, sometimes putting the new stack frame below the old one and sometimes putting it above.

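That's expected: notesearch runs in a new process, and with ASLR enabled the kernel picks a random stack location for each process at execve time. The addresses printed by the two programs are therefore unrelated, and neither stack is "above" or "below" the other; they live in different address spaces.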
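You can observe this without the exploit at all. The sketch below assumes Linux with /proc mounted and grep on the PATH, and all function names are hypothetical. It reads the caller's own [stack] mapping and then the [stack] mapping of a freshly exec'd child. It uses popen() rather than system() only so the child's output can be read back; like system(), popen() runs the command through /bin/sh -c in a new process.

```c
#define _POSIX_C_SOURCE 200809L   /* popen()/pclose() under strict -std= modes */
#include <stdio.h>
#include <string.h>

/* Find the "[stack]" line in a /proc/<pid>/maps style stream and parse its
   address range. Returns 0 on success, -1 if no such line is found. */
int parse_stack_range(FILE *maps, unsigned long *lo, unsigned long *hi)
{
    char line[512];
    while (fgets(line, sizeof line, maps)) {
        if (strstr(line, "[stack]") && sscanf(line, "%lx-%lx", lo, hi) == 2)
            return 0;
    }
    return -1;
}

/* Stack mapping of the calling process. */
int own_stack_range(unsigned long *lo, unsigned long *hi)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps)
        return -1;
    int rc = parse_stack_range(maps, lo, hi);
    fclose(maps);
    return rc;
}

/* Stack mapping of a freshly exec'd child: the child here is grep, which
   opens /proc/self/maps and therefore reports its own mappings. */
int child_stack_range(unsigned long *lo, unsigned long *hi)
{
    FILE *child = popen("grep -F '[stack]' /proc/self/maps", "r");
    if (!child)
        return -1;
    int rc = parse_stack_range(child, lo, hi);
    pclose(child);
    return rc;
}

/* Print both ranges so they can be compared by eye. */
void print_stack_ranges(void)
{
    unsigned long lo, hi;
    if (own_stack_range(&lo, &hi) == 0)
        printf("caller stack: %lx-%lx\n", lo, hi);
    if (child_stack_range(&lo, &hi) == 0)
        printf("child stack:  %lx-%lx\n", lo, hi);
}
```

Call print_stack_ranges() from main and run the program a few times. With ASLR enabled the two ranges should be unrelated and change on every run; with kernel.randomize_va_space=0 both stacks should end at the same address.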

is there a way to disable these security measures on x64 linux when compiling, so that I can follow along with Erickson's code while I am learning?

Yes, you can disable ASLR to stop the kernel from randomizing the position of the stack:

sudo sysctl -w kernel.randomize_va_space=0

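gdb should already do this for you, but only if the process is started from inside gdb. Note that the sysctl is system-wide and lasts until reboot; set kernel.randomize_va_space back to 2 (the usual default) when you are done.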
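If you would rather not change the system-wide setting, a process can ask the kernel to turn randomization off just for itself and whatever it execs, using personality(2) with ADDR_NO_RANDOMIZE (the same mechanism behind setarch -R). Below is a rough sketch of a system() variant that does this in the child. The name system_no_aslr and the exit code 125 are my own conventions, not part of any library. One caveat: Linux clears this flag when executing a set-user-ID binary, so for the suid root notesearch from the book you will most likely still need the sysctl. Some container sandboxes also block this personality call.

```c
#include <sys/personality.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `cmd` through /bin/sh like system(), but with
   ASLR disabled for the child. Returns the wait status, or -1 on error.
   The child exits with 125 if the personality cannot be changed. */
int system_no_aslr(const char *cmd)
{
    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        int persona = personality(0xffffffff);   /* query current persona */
        if (persona == -1 ||
            personality((unsigned long)persona | ADDR_NO_RANDOMIZE) == -1)
            _exit(125);                          /* could not disable ASLR */

        /* The persona is kept across execve for ordinary binaries. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                              /* only if execl fails */
    }

    int wstatus;
    if (waitpid(child, &wstatus, 0) == -1)
        return -1;
    return wstatus;                              /* decode with WEXITSTATUS() */
}
```

For example, system_no_aslr("grep -F '[stack]' /proc/self/maps") should print the same stack range on every run, even while randomize_va_space is left at its default.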