I decided it would be fun to learn x86 assembly during the summer break. So I started with a very simple hello world program, borrowing from the examples that gcc -S
could give me. I ended up with this:
HELLO:
.ascii "Hello, world!\12\0"
.text
.globl _main
_main:
pushl %ebp # 1. saves the caller's base pointer on the stack
movl %esp, %ebp # 2. makes the current stack pointer the new base pointer
subl $20, %esp # 3. ???
pushl $HELLO # 4. push HELLO's address on the stack
call _puts # 5. call puts
xorl %eax, %eax # 6. zero %eax, probably not necessary since we didn't do anything with it
leave # 7. clean up
ret # 8. return
# PROFIT!
It compiles and even works! And I think I understand most of it.
The magic happens at step 3, though. If I remove this line, my program dies between the call to puts
and the xor
with a misaligned stack error. And if I change $20
to another value, it crashes too. So I came to the conclusion that this value is very
important.
Problem is, I don't know what it does and why it's needed.
Can anyone explain it to me? (I'm on Mac OS, in case it matters.)
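On x86 OS X, the stack needs to be 16-byte aligned at the point of each function call, see ABI doc here. So, the explanation is that everything placed on the stack before the call to puts has to add up to a multiple of 16:
return address pushed by the call into _main: -4
push base pointer (#1): -4
strange decrement (#3): -20
push argument (#4): -4
total: -32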
To check, change line #3 from $20 to $4, which also works.
Also, as Ignacio Vazquez-Abrams points out, #6 is not optional. After the call, %eax holds whatever puts returned, and the return value of main is taken from %eax, so it has to be zeroed explicitly.
I recently started learning assembly, too (and am still learning). To save you the shock: 64-bit calling conventions are MUCH different (parameters are passed in registers). Found this very helpful for 64-bit assembly.