I'm experimenting with disassembling clang binaries of simple C programs (compiled with -O0), and I'm confused about a certain instruction that gets generated.
Here are two empty main functions with standard arguments, one of which returns a value and the other does not:
// return_void.c
void main(int argc, char** argv)
{
}
// return_0.c
int main(int argc, char** argv)
{
return 0;
}
Now, when I disassemble the binaries, the output looks reasonably different, but there's one line that I don't understand:
return_void.bin:
(__TEXT,__text) section
_main:
0000000000000000 pushq %rbp
0000000000000001 movq %rsp, %rbp
0000000000000004 movl %edi, -0x4(%rbp)
0000000000000007 movq %rsi, -0x10(%rbp)
000000000000000b popq %rbp
000000000000000c retq
return_0.bin:
(__TEXT,__text) section
_main:
0000000100000f80 pushq %rbp
0000000100000f81 movq %rsp, %rbp
0000000100000f84 xorl %eax, %eax # We return with EAX, so we clean it to return 0
0000000100000f86 movl $0x0, -0x4(%rbp) # What does this mean?
0000000100000f8d movl %edi, -0x8(%rbp)
0000000100000f90 movq %rsi, -0x10(%rbp)
0000000100000f94 popq %rbp
0000000100000f95 retq
It only gets generated when the function is not void, so I thought that it might be another way to return 0, but when I changed the returned constant, this line didn't change at all:
// return_1.c
int main(int argc, char** argv)
{
return 1;
}
empty_return_1.bin:
(__TEXT,__text) section
_main:
0000000100000f80 pushq %rbp
0000000100000f81 movq %rsp, %rbp
0000000100000f84 movl $0x1, %eax # Return value modified
0000000100000f89 movl $0x0, -0x4(%rbp) # This value is not modified
0000000100000f90 movl %edi, -0x8(%rbp)
0000000100000f93 movq %rsi, -0x10(%rbp)
0000000100000f97 popq %rbp
0000000100000f98 retq
Why is this line getting generated and what is its purpose?
The purpose of that area is revealed by the following code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv)
{
if (rand() == 42)
return 1;
printf("Hello World!\n");
return 0;
}
At the start it does:
movl $0, -4(%rbp)
then the early return looks as follows:
callq rand
cmpl $42, %eax
jne .LBB0_2
movl $1, -4(%rbp)
jmp .LBB0_3
and then at the end it does:
.LBB0_3:
movl -4(%rbp), %eax
addq $32, %rsp
popq %rbp
retq
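So this area is indeed reserved to store the function's return value: the early return stores its value into the slot and jumps to a single exit block, which loads the slot into %eax before returning. The slot isn't strictly necessary and isn't used in optimized code, but in -O0 mode that's the way it works.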
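To make the pattern easier to see, here is a rough C-level sketch of the shape of the -O0 code above: one return slot, written by each return path, and one shared exit that reads it. The function name early_return_lowered and its parameter r (standing in for the result of rand()) are made up for illustration, and this only mirrors the assembly shown; it is not what clang literally produces internally.

```c
/* Hypothetical helper, not from the original post: it mirrors the
   structure of the -O0 assembly using one return slot and one exit. */
int early_return_lowered(int r)
{
    int retval = 0;     /* like: movl $0, -4(%rbp) at function entry */

    if (r == 42) {
        retval = 1;     /* like: movl $1, -4(%rbp) */
        goto done;      /* like: jmp .LBB0_3 */
    }

    /* the printf call would sit here */
    retval = 0;         /* the final "return 0" also goes through the slot */

done:
    return retval;      /* like: movl -4(%rbp), %eax ; popq %rbp ; retq */
}
```

In the single-return programs from the question there is nothing to merge, so the constant goes straight into %eax and the slot is only initialized and then never read, which would explain why that line stays the same when the returned constant changes.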