Re-writing a small execve shellcode

While going through http://hackoftheday.securitytube.net/2013/04/demystifying-execve-shellcode-stack.html I understood the nasm program that invokes execve, and I tried to rewrite it.

Some background information:

int execve(const char *filename, char *const argv[], char *const envp[]);

So, eax = 11 (the syscall number for execve), ebx should point to the char* filename, ecx should point to argv[] (which will be the same as ebx, since the first argument is the filename itself, e.g. "/bin/sh" in this case), and edx should point to envp[] (null in this case).

Original nasm code:

global _start

section .text
_start:

xor eax, eax
push eax

; PUSH //bin/sh in reverse i.e. hs/nib//

push 0x68732f6e
push 0x69622f2f

mov ebx, esp

push eax
mov edx, esp

push ebx
mov ecx, esp

mov al, 11
int 0x80

The stack is as follows:

[image: stack layout for the original code]

I then tried to optimize this by removing a few instructions. I agree that up to mov ebx, esp the code has to remain the same. However, since ecx needs to point to the same thing as ebx, I thought I could rewrite the code as follows:

global _start

section .text
_start:

xor eax, eax
push eax

; PUSH //bin/sh in reverse i.e. hs/nib//

push 0x68732f6e
push 0x69622f2f
mov ebx, esp

mov ecx,ebx

push eax
mov edx, esp

mov al, 11
int 0x80

However, I get a segmentation fault when I run my re-written code.

My stack is as follows:

[image: stack layout for the re-written code]

Any ideas why the re-written code does not work? I've also run gdb and the address values match my expectations, but it just won't run.

Solution

In both cases ebx points to the string "//bin/sh". That is the equivalent of C code like this:

char *EBX = "//bin/sh";

But in your first example, ecx is set to the address of a pointer to that string, which is the equivalent of C code like this:

char *temp = "//bin/sh"; // push ebx
char **ECX = &temp;      // mov ecx, esp

In your second example, however, ecx is just set to the same value as ebx:

char *ECX = "//bin/sh";

The two examples are thus fundamentally different, with ecx having two completely different types and values.
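To make that difference concrete, here is a rough sketch of what execve sees in the second example when it treats ecx as an array of pointers. The helper word_at and the array filename_bytes are names I made up for illustration; the helper reads 32-bit little-endian words the way a 32-bit x86 kernel would read argv[0], argv[1] and so on. With ecx pointing at the string itself, those "pointers" are simply the characters of the string. My assumption is that the kernel then rejects them as bad addresses, execve returns an error, and execution carries on into whatever bytes follow int 0x80, which would account for the segmentation fault.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read the idx-th 32-bit little-endian word at p,
   i.e. what a 32-bit x86 kernel would take as argv[idx] if ecx == p. */
uint32_t word_at(const unsigned char *p, size_t idx)
{
    const unsigned char *b = p + 4 * idx;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* The bytes ebx points to after the three pushes: the eight characters
   of "//bin/sh" followed by the zero dword from push eax. */
const unsigned char filename_bytes[12] = "//bin/sh";
```

Here word_at(filename_bytes, 0) is 0x69622f2f and word_at(filename_bytes, 1) is 0x68732f6e, so argv[0] and argv[1] would be the two string constants reinterpreted as addresses, and only the third word is the NULL terminator.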

Update:

I should add that technically ecx points to an array of char pointers (the argv argument), not just to a single char pointer. You're actually building up a two-item array on the stack.

char *argv[2];
argv[1] = NULL;         // push eax, eax being zero
argv[0] = "//bin/sh";   // push ebx
ECX = argv;             // mov ecx,esp

It's just that half of that array is doubling as the envp argument too. Since envp is a single-item array whose only item is NULL, you can think of the envp argument being set with C code like this:

EDX = envp = &argv[1];

This is achieved by setting edx to esp while the argv array is only half constructed. Combining the code for the two assignments together you get this:

char *argv[2];
argv[1] = NULL;         // push eax, eax being zero
EDX = &argv[1];         // mov edx,esp
argv[0] = "//bin/sh";   // push ebx
ECX = argv;             // mov ecx,esp

It's a bit convoluted, but I hope that makes sense to you.
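If it helps, the same idea can be written as a small C program fragment. This is only a sketch: build_args and spawn_shell are hypothetical names of mine, and the only real API used is execve from unistd.h. It fills the two-slot array in the same order as the pushes and hands back an envp that aliases the second slot.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: fill a two-slot array in the same order as the
   shellcode and return the envp pointer that aliases its second slot. */
char **build_args(char *argv[2], char *filename)
{
    argv[1] = NULL;            /* push eax, eax being zero */
    char **envp = &argv[1];    /* mov edx, esp */
    argv[0] = filename;        /* push ebx */
    return envp;               /* argv itself plays the role of ecx */
}

/* Replaces the current process with a shell if execve succeeds. */
int spawn_shell(void)
{
    static char filename[] = "//bin/sh";
    char *argv[2];
    char **envp = build_args(argv, filename);
    return execve(filename, argv, envp);   /* only returns on failure */
}
```

The point to notice is that envp and argv + 1 are the same address, so a single NULL terminates argv and also serves as the whole (empty) environment.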

Update 2

All of the arguments to execve are passed in registers, but those registers hold pointers to memory which needs to be allocated somewhere, in this case on the stack. Since the stack grows downwards in memory, the chunks of memory need to be constructed in reverse order.

The memory for the three arguments looks like this:

char *filename:  2f 2f 62 69 | 6e 2f 73 68 | 00 00 00 00 
char *argv[]:    filename    | 00 00 00 00               
char *envp[]:    00 00 00 00

The filename is constructed like this:

push eax        ; '\0' terminator plus some extra
push 0x68732f6e ; 'h','s','/','n'
push 0x69622f2f ; 'i','b','/','/'
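As a rough check of those three pushes, here is a toy model of a downward-growing, little-endian stack in C. Everything in it (struct toy_stack, toy_init, toy_push32, toy_build_filename) is invented for illustration and is not how a real stack is implemented; it only mimics what push does to esp and to memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A toy downward-growing stack; sp is an index into mem, like esp. */
struct toy_stack {
    unsigned char mem[64];
    size_t sp;
};

void toy_init(struct toy_stack *s)
{
    memset(s->mem, 0xCC, sizeof s->mem);   /* junk, so the pushed zeros stand out */
    s->sp = sizeof s->mem;                 /* start at the top */
}

/* push: move sp down by 4, then store the dword least significant byte first. */
void toy_push32(struct toy_stack *s, uint32_t v)
{
    s->sp -= 4;
    s->mem[s->sp + 0] = (unsigned char)(v & 0xff);
    s->mem[s->sp + 1] = (unsigned char)((v >> 8) & 0xff);
    s->mem[s->sp + 2] = (unsigned char)((v >> 16) & 0xff);
    s->mem[s->sp + 3] = (unsigned char)((v >> 24) & 0xff);
}

/* Run the three pushes and return what mov ebx, esp would capture. */
const char *toy_build_filename(struct toy_stack *s)
{
    toy_push32(s, 0);            /* push eax */
    toy_push32(s, 0x68732f6e);   /* push 0x68732f6e */
    toy_push32(s, 0x69622f2f);   /* push 0x69622f2f */
    return (const char *)&s->mem[s->sp];
}
```

After toy_init, the pointer returned by toy_build_filename reads as the NUL-terminated string "//bin/sh": the last dword pushed sits at the lowest address, so it supplies the first four characters.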

The argv argument like this:

push eax ; NULL pointer
push ebx ; filename

And the envp argument like this:

push eax ; NULL pointer

But as I said, the original example decided to share memory between argv and envp, so there is no need for that last push eax.

I should also note that the reverse order of the characters within each of the two dwords used when constructing the string is because of the endianness of the machine, not the stack direction.
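To illustrate the endianness point, here is a small sketch of how those two constants can be derived from the characters. The helper pack4 is a hypothetical name of mine; it puts the first character in the least significant byte, which is the byte a little-endian machine stores at the lowest address.

```c
#include <stdint.h>

/* Hypothetical helper: pack four characters into the dword that a
   little-endian machine stores as c[0], c[1], c[2], c[3] in memory. */
uint32_t pack4(const char c[4])
{
    return (uint32_t)(unsigned char)c[0]
         | ((uint32_t)(unsigned char)c[1] << 8)
         | ((uint32_t)(unsigned char)c[2] << 16)
         | ((uint32_t)(unsigned char)c[3] << 24);
}
```

With this, pack4("//bi") gives 0x69622f2f and pack4("n/sh") gives 0x68732f6e, the two values pushed in the shellcode. The characters appear reversed when the constant is written as a hex number, and the order of the two pushes is reversed separately because of the stack direction.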