I am trying to understand the FPU, and I am pretty confused. As I understand from here, the FPU has its own stack. But, for example, in this code (NASM):
global _main
extern _printf
section .data
hellomessage db `Hello World!\n`, 10, 0
numone dd 1.2
digitsign db '%f', 0xA, 0
section .text
_main:
;Greet the user
push hellomessage
call _printf
add esp,4
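sub esp, 8 ;Reserve 8 bytes on the call stack for the double argument
fld dword[numone] ;Load the 4-byte float into st0
fstp qword[esp] ;Store st0 as an 8-byte double at [esp] and pop the x87 stack
push digitsign ;Format string pointer goes just below the double
call _printf
add esp, 12 ;Remove 4 (pointer) + 8 (double) bytes of arguments
ret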
I have to have the sub esp, 8 line to "make space" for a double, otherwise the program crashes. But by doing this, I change the pointer of the "regular stack", which does not make sense with my idea of two separate stacks.
I am certain that I do not understand something, but I do not know what it is.
x87 loads/stores use the same memory addresses that everything else does. The x87 stack is registers st0..st7, not memory at all.
See SIMPLY FPU: Chap. 1 Description of FPU Internals for details on the x87 register stack.
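To make the "registers, not memory" point concrete, here is a small sketch (not taken from the question's program) of x87 instructions that push and pop the register stack. None of them reads or writes ESP, and none of them touches memory:

```nasm
; Sketch: these instructions only move the x87 register stack st0..st7.
; ESP and the call stack in memory are left untouched.
    fld1                ; push 1.0        -> st0 = 1.0
    fldpi               ; push pi         -> st0 = pi, st1 = 1.0
    faddp st1, st0      ; st1 += st0, pop -> st0 = pi + 1.0
    fstp  st0           ; pop again, leaving the x87 register stack empty
```

Memory only gets involved when an x87 instruction has a memory operand, such as fld dword[numone] or fstp qword[esp].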
fstp qword[esp] stores 8 bytes to the regular call stack, like mov [esp], eax / mov [esp+4], edx would do. Addressing modes don't change meaning when used with x87 load/store instructions: your process only has one address space.
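As a rough illustration of that equivalence, the double argument can be placed on the call stack with plain integer stores and no x87 instructions at all. This is only a sketch: numone_q is a hypothetical label I'm introducing here that holds 1.2 as an 8-byte double (the question's numone is a 4-byte float, which fld/fstp converts for you); digitsign and _printf are the ones from the question.

```nasm
section .data
numone_q dq 1.2                 ; hypothetical label: 1.2 stored as an 8-byte double

section .text
    sub  esp, 8                 ; reserve 8 bytes for the double argument
    mov  eax, [numone_q]        ; low 4 bytes of the double
    mov  edx, [numone_q+4]      ; high 4 bytes of the double
    mov  [esp], eax             ; these two stores write the same 8 bytes of
    mov  [esp+4], edx           ;   memory that fstp qword [esp] would write
    push digitsign              ; format string pointer goes below the double
    call _printf
    add  esp, 12                ; remove 4 (pointer) + 8 (double) bytes
```

Either way, printf just finds 8 bytes of double above the format-string pointer; it has no idea which instructions wrote them.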
So if you remove the sub esp, 8 instruction, your fstp would overwrite your return address (and the 4 bytes above it).
Then at the end of the function, add esp, 12 would leave esp pointing 8 bytes above the return address, so ret will pop some garbage into EIP. You then segfault when trying to fetch code from that bad address, or the bytes there decode to instructions which segfault.
Above main's return address, you'll find argc and then char **argv. It's a pointer to an array of pointers, so using it as a return address will mean you execute pointer values as code. (If I got this right.)
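Here is the broken version (the question's code with sub esp, 8 removed) annotated with where ESP points at each step. R is just a name I'm using for the value of ESP on entry to _main, and the argc/argv layout is the same assumption as in the paragraph above:

```nasm
; Deliberately broken: traces ESP when "sub esp, 8" is left out.
; On entry: [R] = return address, [R+4] = argc, [R+8] = argv (assumed layout).
_main:
    push hellomessage       ; esp = R-4
    call _printf            ; esp = R-4 again once printf has returned
    add  esp, 4             ; esp = R
                            ; (sub esp, 8 would go here: esp = R-8)
    fld  dword [numone]     ; x87 register stack only, esp unchanged
    fstp qword [esp]        ; writes [R .. R+7]: clobbers return address and argc
    push digitsign          ; esp = R-4
    call _printf            ; still prints, its arguments are where it expects them
    add  esp, 12            ; esp = R+8
    ret                     ; pops [R+8] (argv) into EIP instead of [R]
```

With the sub esp, 8 in place, every ESP value after that point is 8 lower, so the store lands in [R-8 .. R-1] and the final add esp, 12 brings ESP back to R before ret.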
Use a debugger to see what happens to registers and memory as you single step.
Note that add esp,4 / sub esp, 8 is a bit silly. add esp, +4 - 8 (i.e. add esp, -4) would be a self-documenting way to do that with one instruction.
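Applied to the body of _main from the question, that would look something like this (a sketch using the question's own labels; NASM evaluates the constant expression at assemble time):

```nasm
_main:
    push hellomessage
    call _printf
    add  esp, +4 - 8        ; drop the 4-byte argument, reserve 8 bytes for the double
    fld  dword [numone]
    fstp qword [esp]        ; store the double into the space just reserved
    push digitsign
    call _printf
    add  esp, 12            ; remove 4 (pointer) + 8 (double) bytes
    ret
```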