Tags: c, assembly, x86-64, stack-memory

Why does my buffer have more memory allocated on the stack than I asked for?


Here's my source code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX 500

int main(int argc, char** argv)
{
        if (argc != 2)
                exit(1);
        char str[MAX];
        strcpy(str, argv[1]);
        return 0;
}

I disassembled main using gdb and got the following result:

Dump of assembler code for function main:
   0x0000000000001145 <+0>:     push   %rbp
   0x0000000000001146 <+1>:     mov    %rsp,%rbp
   0x0000000000001149 <+4>:     sub    $0x210,%rsp
   .
   .
   .
End of assembler dump.

Here the notable thing is:

0x0000000000001149 <+4>: sub $0x210,%rsp

My question is: why does it subtract $0x210 (528 bytes), when the buffer I asked for is only $0x1f4 (500 bytes)?


Solution

  • I am guessing you are using gcc and compiling without optimizations, like this (godbolt).

    There are a couple of things going on here:

    First, when compiling without optimizations, the compiler tries to ensure that every local variable has an address in memory, so that it can easily be inspected or modified by a debugger. This includes function parameters, which on x86-64 are otherwise passed in registers. So the compiler needs to allocate additional stack space where the argc and argv parameters can be "spilled". You can see the spilling at lines 5 and 6 of the assembly:

            movl    %edi, -516(%rbp)
            movq    %rsi, -528(%rbp)
    

    If you look carefully, you may note that the compiler wasted 4 bytes by placing argc (from %edi) at address -516(%rbp) when -520(%rbp) was otherwise available. It's not entirely clear why, but after all, it's not optimizing! So that gets us to 516 bytes: 500 for str, 8 for argv, 4 for argc, and the 4 that were wasted.

    The other issue is that the x86-64 ABI requires 16-byte stack alignment; see "Why does the x86-64 / AMD64 System V ABI mandate a 16 byte stack alignment?". In this case, to make a long story short, it implies that our stack adjustment needs to be a multiple of 16 bytes. (The return address and the pushed rbp add a further 16 bytes, which doesn't disturb this alignment.) So our 516 must be rounded up to the next multiple of 16, which is 528, or 0x210.

    If the compiler had been more careful and not wasted those 4 bytes between argc and argv, we could have gotten away with only 512 bytes. One benefit of using 528, though, is that the buffer str ends up 16-byte aligned. This isn't required for an array of char, whose minimum alignment is just 1, but it can make it more efficient for string functions like strcpy to use fast SIMD algorithms. I am not sure if the compiler is doing this deliberately or if it's just a coincidence.