gcc assembly x86-64 calling-convention

Why does the x86-64 System V calling convention pass args in registers instead of just the stack?


Why is it that 32-bit C pushes all function arguments straight onto the stack while 64-bit C puts the first 6 arguments into registers and the rest on the stack?

So the 32-bit stack would look like:

...
arg2
arg1
return address
old %ebp

While the 64-bit stack would look like:

...
arg8
arg7
return address
old %rbp
arg6
arg5
arg4
arg3
arg2
arg1

So why does 64-bit C do this? Isn't it much easier to just push everything onto the stack instead of putting the first 6 arguments in registers, only to move them onto the stack in the function prologue?


Solution

  • "... instead of putting the first 6 arguments in registers, only to move them onto the stack in the function prologue?"

    I was looking at some code that gcc generated and that's what it always did.

    Then you forgot to enable optimization. gcc -O0 spills every variable, including incoming register args, to memory so that you can modify variables with a debugger while single-stepping. That's obviously horrible for performance, so compilers don't do that unless you force them to by compiling with -O0.

    x86-64 System V allows int add(int x, int y) { return x+y; } to compile to
    lea eax, [rdi + rsi] / ret, which is what compilers actually do as you can see on the Godbolt compiler explorer.
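
To see the difference yourself, here is a minimal sketch you could compile both ways. The file name `args.c`, the function `sum8`, and the helper `check_examples` are hypothetical names chosen for illustration; only `add` comes from the text above. The register assignments in the comments are the standard x86-64 System V integer-argument order.

```c
#include <assert.h>

/* Two integer args: x arrives in edi, y arrives in esi.
 * With optimization enabled there is no need to touch the stack at all. */
int add(int x, int y) {
    return x + y;
}

/* Hypothetical eight-argument function.
 * a..f arrive in rdi, rsi, rdx, rcx, r8, r9 (in that order);
 * g and h are the 7th and 8th integer args, so the caller
 * places them on the stack above the return address. */
long sum8(long a, long b, long c, long d,
          long e, long f, long g, long h) {
    return a + b + c + d + e + f + g + h;
}

/* Small self-check so the sketch can be exercised from a test harness. */
void check_examples(void) {
    assert(add(2, 3) == 5);
    assert(sum8(1, 2, 3, 4, 5, 6, 7, 8) == 36);
}
```

Assuming the file is saved as `args.c`, comparing `gcc -O0 -S -masm=intel args.c` with `gcc -O2 -S -masm=intel args.c` should show the -O0 version of `add` storing edi and esi to stack slots before reloading them, while the optimized version works on the registers directly.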

    Stack-args calling conventions are slow and obsolete. RISC machines have been using register-args calling conventions since before x86-64 existed, and on OSes that still care about 32-bit x86 (i.e. Windows), there are better calling conventions like __vectorcall that pass the first 2 integer args in registers.

    i386 System V hasn't been replaced because people mostly don't care as much about 32-bit performance on other OSes; we just use 64-bit code with the nicely-designed x86-64 System V calling convention.

    For more about the tradeoff between register args and call-preserved vs. call-clobbered registers in calling convention design, see "Why not store function parameters in XMM vector registers?" and also "Why does Windows64 use a different calling convention from all other OSes on x86-64?".