I'm working on practice problem 3.34 in Computer Systems: A Programmer's Perspective and trying to understand exactly what is happening. The question states: "Consider a function P, which generates local values, named a0-a7. It then calls function Q using these generated values as arguments. GCC produces the following code for the first part of P". We are given the following assembly code:
/* long P(long x)
* x in %rdi */
P:
pushq %r15
pushq %r14
pushq %r13
pushq %r12
pushq %rbp
pushq %rbx
subq $24, %rsp
leaq 1(%rdi), %r15
leaq 2(%rdi), %r14
leaq 3(%rdi), %r13
leaq 4(%rdi), %r12
leaq 5(%rdi), %rbp
leaq 6(%rdi), %rax
movq %rax, (%rsp)
leaq 7(%rdi), %rdx
movq %rdx, 8(%rsp)
movl $0, %eax
call Q
So far, this is what I understand:
The instructions pushq %r15 through pushq %rbx push those registers onto the stack to preserve their values, which are restored to their respective registers when procedure P returns (since they are callee-saved registers).
I see that the instruction subq $24, %rsp allocates 24 bytes of space on the stack.
I have two questions though:
It looks like the leaq instructions are taking the memory address of long x and storing that new memory address (after adding 1 or 2 or ... 7) in the various callee-saved registers. Is this correct? I'm a bit confused about the values they store. Is there any significance to them? Also, what will function Q do with these registers? How does it know to use them as arguments, since they don't seem to be the argument registers? Only long x is passed on as an argument (as it is in register %rdi).
???????????????????????????????????:16
The result of 7(%rdi) (argument a7):8
The result of 6(%rdi) (argument a6):0 <--- %rsp
I can't seem to account for what is contained in bytes 16-23 :(
Thank you so much in advance, I'm really struggling with this one.
First, note that there is an erratum for this practice problem. The local variables are not passed as arguments to Q; rather, Q is being called with no arguments. So that explains why the variables aren't in the argument-passing registers.
(The strange zeroing of eax may be explained by "Differences in the initialization of the EAX register when calling a function in C and C++"; they might have accidentally declared void Q(); instead of void Q(void); and with an empty parameter list the compiler must allow for a variadic callee, for which the x86-64 System V ABI expects %al to hold the number of vector registers used. I'm not sure why the compiler emitted movl $0, %eax instead of the more efficient xorl %eax, %eax; it looks like optimizations are on, and that's a very basic optimization.)
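To illustrate the declaration difference mentioned above, here is a minimal C sketch. The names Q_unprototyped, Q_prototyped, and call_both are hypothetical and do not come from the book. The comments about %al describe what the x86-64 System V calling convention asks for when the callee might be variadic; the exact instructions emitted depend on the compiler, its version, and its flags, and in C23 mode both declarations mean the same thing.

```c
/* Before C23, an empty parameter list says nothing about the parameters,
 * so the compiler has to allow for a variadic callee. */
long Q_unprototyped();

/* An explicit (void) is a prototype: the function takes no arguments. */
long Q_prototyped(void);

long call_both(void)
{
    /* On x86-64 System V, a call through the unprototyped declaration is
     * typically preceded by zeroing %al (the count of vector registers
     * used for arguments); the prototyped call needs no such setup. */
    long a = Q_unprototyped();
    long b = Q_prototyped();
    return a + b;
}

/* Hypothetical bodies so the sketch links and runs. */
long Q_unprototyped() { return 1; }
long Q_prototyped(void) { return 2; }
```

Compiling this with something like gcc -O1 -S and comparing the two call sites should show whether your compiler sets eax before each call.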
Now as for lea, it's really just an arithmetic instruction, and compilers tend to use it that way. See "What's the purpose of the LEA instruction?". So leaq 1(%rdi), %r15 simply adds 1 to the value in rdi and writes the result to r15. Whether the value in rdi represented an address or a number or something else is irrelevant to the machine. Since rdi contained the argument x, this is effectively doing:
a0 = x + 1;
a1 = x + 2;
a2 = x + 3;
// ...
The alternative would be something like movq %rdi, %r15 ; addq $1, %r15 which is more instructions.
Of course, these values are being put in callee-saved registers (or memory, for the last two) so that they are not destroyed by the call to Q().
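As a rough illustration, the listing could correspond to C along these lines. This is only a sketch: the variables are named after the register or stack slot that holds each value in the assembly, the body of Q is a hypothetical stand-in, and the return expression is an assumption, since the listing stops at the call.

```c
/* Hypothetical stand-in for Q; its real body is not shown in the problem. */
long Q(void) { return 0; }

long P_sketch(long x)
{
    long r15   = x + 1;  /* leaq 1(%rdi), %r15 */
    long r14   = x + 2;  /* leaq 2(%rdi), %r14 */
    long r13   = x + 3;  /* leaq 3(%rdi), %r13 */
    long r12   = x + 4;  /* leaq 4(%rdi), %r12 */
    long rbp   = x + 5;  /* leaq 5(%rdi), %rbp */
    long slot0 = x + 6;  /* leaq 6(%rdi), %rax ; movq %rax, (%rsp)  */
    long slot8 = x + 7;  /* leaq 7(%rdi), %rdx ; movq %rdx, 8(%rsp) */

    /* Q may overwrite caller-saved registers such as %rax and %rdx,
     * which is why the last two values are parked in memory first. */
    Q();

    /* Assumed use of the values after the call (not part of the listing). */
    return r15 + r14 + r13 + r12 + rbp + slot0 + slot8;
}
```

Every value has to survive the call because it is still needed afterwards, so each one lives either in a callee-saved register or in P's own stack frame.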
As for the stack, the x86-64 ABI requires 16-byte stack alignment. The stack pointer was a multiple of 16 before the call to P, and it must again be a multiple of 16 when we call Q. Our caller's call P pushed 8 bytes to the stack, and the various register pushes in the prologue push 48 bytes more. So in order to end up with a multiple of 16, we must adjust the stack pointer by 8 more than a multiple of 16 (i.e. an odd multiple of 8). We need 16 bytes for local variables, so we must adjust the stack pointer by 24. That leaves 8 bytes of stack that just won't be used for anything, which is your ?????? at 16(%rsp).
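The alignment arithmetic can be checked with a small C sketch. The helper names frame_adjust and pushed_in_P are hypothetical; the code only reproduces the reasoning above (round the total up to a multiple of 16) and is not how GCC itself is implemented.

```c
/* Smallest adjustment >= local_bytes that makes
 * (pushed_bytes + adjustment) a multiple of 16. */
long frame_adjust(long pushed_bytes, long local_bytes)
{
    long total   = pushed_bytes + local_bytes;
    long rounded = (total + 15) & ~15L;  /* round up to a multiple of 16 */
    return rounded - pushed_bytes;
}

/* Bytes already on the stack when P reaches subq:
 * 8 for the return address plus six 8-byte register pushes. */
long pushed_in_P(void)
{
    return 8 + 6 * 8;
}
```

With 56 bytes already pushed and 16 bytes of locals, the total of 72 rounds up to 80, so the adjustment is 24, matching subq $24, %rsp.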