So I have a problem from my textbook (Computer Systems: A Programmer's Perspective Problem 3.64):
It gives code like this:
typedef struct {
    int a;
    int *p;
} str1;

typedef struct {
    int sum;
    int diff;
} str2;

str2 word_sum(str1 s1) {
    str2 result;
    result.sum = s1.a + *s1.p;
    result.diff = s1.a - *s1.p;
    return result;
}

int prod(int x, int y) {
    str1 s1;
    str2 s2;
    s1.a = x;
    s1.p = &y;
    s2 = word_sum(s1);
    return s2.sum * s2.diff;
}
and then the assembly code for the prod & word_sum functions:
1 word_sum:
2 pushl %ebp
3 movl %esp, %ebp
4 pushl %ebx
5 movl 8(%ebp), %eax
6 movl 12(%ebp), %ebx
7 movl 16(%ebp), %edx
8 movl (%edx), %edx
9 movl %ebx, %ecx
10 subl %edx, %ecx
11 movl %ecx, 4(%eax)
12 addl %ebx, %edx
13 movl %edx, (%eax)
14 popl %ebx
15 popl %ebp
16 ret $4
1 prod:
2 pushl %ebp
3 movl %esp, %ebp
4 subl $20, %esp
5 leal 12(%ebp), %edx
6 leal -8(%ebp), %ecx
7 movl 8(%ebp), %eax
8 movl %eax, 4(%esp)
9 movl %edx, 8(%esp)
10 movl %ecx, (%esp)
11 call word_sum
12 subl $4, %esp
13 movl -4(%ebp), %eax
14 imull -8(%ebp), %eax
15 leave
16 ret
And it asks why prod allocates 20 bytes on the stack in the assembly code line 4.
I can see that it would allocate 8 bytes each for str1 and str2 but I have no idea what the 5th 4-byte memory allocation would be.
Also, would you guys have any recommendations (videos, articles, blog posts) on learning x86 stack frame structure and procedure calls? Very lost in my Computer Architecture course at the moment.
The allocations are 8 bytes for s1 (the copy passed by value to word_sum), 8 bytes for s2, and 4 bytes to pass word_sum an address to store its result at.
How did I figure this out?
If we look at the top of prod, we see:
5 leal 12(%ebp), %edx
6 leal -8(%ebp), %ecx
7 movl 8(%ebp), %eax
Lines 5 and 7 are the only instructions accessing our caller's stack frame, so they must be grabbing x and y. We know that we're storing a pointer to y and line 5 is a lea instruction, so we can assume that EDX holds &y and EAX holds x. This still leaves ECX, which holds a pointer to something in our stack frame.
Moving on, we see that it's storing EAX, EDX, and ECX on our stack, and then calling word_sum:
8 movl %eax, 4(%esp)
9 movl %edx, 8(%esp)
10 movl %ecx, (%esp)
11 call word_sum
We know that EAX and EDX hold the values that need to be stored in s1. We know that s1 will be passed to word_sum, and arguments are passed at the top of the stack. Lines 8 and 9 are storing EAX and EDX very close to the top of the stack, so we can assume this is s1.
Functions that return a struct expect an extra pointer to be passed at the top of the stack. This is the address at which the function should store its return value. The only other thing we're storing on the top of the stack is ECX, and we know that we're storing the result of word_sum in s2, so ECX must be a pointer to s2.
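To make the hidden pointer concrete, here is a minimal sketch of what the call amounts to if you write the extra argument out by hand. This is not compiler output; the names word_sum_lowered and prod_lowered are made up for illustration, and the comments tie each store back to the word_sum listing.

```c
typedef struct { int a; int *p; } str1;
typedef struct { int sum; int diff; } str2;

/* Hypothetical hand-lowered form of: str2 word_sum(str1 s1).
   The caller passes the address where the result must be written. */
void word_sum_lowered(str2 *result, str1 s1) {
    result->sum  = s1.a + *s1.p;   /* corresponds to: movl %edx, (%eax)  */
    result->diff = s1.a - *s1.p;   /* corresponds to: movl %ecx, 4(%eax) */
}

/* Hypothetical hand-lowered form of prod. */
int prod_lowered(int x, int y) {
    str1 s1;
    str2 s2;
    s1.a = x;
    s1.p = &y;
    word_sum_lowered(&s2, s1);     /* &s2 plays the role of ECX */
    return s2.sum * s2.diff;
}
```

Calling prod_lowered(5, 3) returns (5 + 3) * (5 - 3) = 16, the same value the original prod would return.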
We've now surmised what each register holds: EAX is x, EDX is &y, and ECX is &s2.
If we look lower, we can confirm our expectations:
13 movl -4(%ebp), %eax
14 imull -8(%ebp), %eax
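We know that the result of this function is s2.sum * s2.diff. There's an imul instruction multiplying the values at EBP-8 and EBP-4, so those two slots must hold the members of s2. The multiplication alone doesn't tell us which is which, but sum is declared first in str2, so EBP-8 must hold s2.sum and EBP-4 must hold s2.diff. This matches lines 11 and 13 of word_sum, which store the difference at 4(%eax) and the sum at (%eax).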
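The member-order argument can be checked with offsetof. The sketch below assumes s2 starts at EBP-8 (the address computed on line 6 of prod) and a 4-byte int, as on IA32; the helper names ebp_offset, sum_slot and diff_slot are invented for this example.

```c
#include <stddef.h>

typedef struct { int sum; int diff; } str2;

/* Hypothetical helper: where a member lives relative to %ebp, given
   where its struct starts relative to %ebp. */
long ebp_offset(long struct_start, size_t member_offset) {
    return struct_start + (long)member_offset;
}

/* Assumes s2 starts at -8(%ebp), per `leal -8(%ebp), %ecx`. */
long sum_slot(void)  { return ebp_offset(-8, offsetof(str2, sum)); }
long diff_slot(void) { return ebp_offset(-8, offsetof(str2, diff)); }
```

With a 4-byte int, sum_slot() gives -8 and diff_slot() gives -4, which are the two operands of lines 13 and 14.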
If we backtrack to line 6, we see that the address EBP-8 was loaded into ECX, which we correctly suspected was a pointer to s2.
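Putting the pieces together, the 20 bytes reserved on line 4 can be pictured as a struct laid out from ESP upward. This is only a model of the frame, not something the compiler defines: prod_frame, its field names and from_ebp are hypothetical, and fixed-width types stand in for IA32's 4-byte ints and pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the area reserved by `subl $20, %esp`,
   lowest address first. */
typedef struct {
    uint32_t result_ptr; /* (%esp)  = -20(%ebp): &s2, hidden return pointer */
    int32_t  s1_a;       /* 4(%esp) = -16(%ebp): s1.a, a copy of x          */
    uint32_t s1_p;       /* 8(%esp) = -12(%ebp): s1.p, which is &y          */
    int32_t  s2_sum;     /* -8(%ebp): s2.sum                                */
    int32_t  s2_diff;    /* -4(%ebp): s2.diff                               */
} prod_frame;

/* Convert an offset within the frame to an offset relative to %ebp,
   which sits just above the frame's highest byte. */
long from_ebp(size_t offset_in_frame) {
    return (long)offset_in_frame - (long)sizeof(prod_frame);
}
```

Here sizeof(prod_frame) is 20: 4 bytes for the result pointer, 8 for the s1 argument, and 8 for s2.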
In general, working through problems like this is mostly a matter of using your knowledge of the source code that generated the assembly to make educated guesses, and then using process of elimination to confirm that each guess is correct.