According to the question "What are the calling conventions for UNIX & Linux system calls on i386 and x86-64", in the x86-64 (AMD64) System V ABI the first six integer arguments of a function are passed in the registers %rdi, %rsi, %rdx, %rcx, %r8 and %r9, in that order. The 7th and later arguments are passed on the stack. So the question is: how does the callee know how many of the remaining (7th and later) arguments to pop, and in which order? Does the callee know it from argc? I have an example:
#include <stdio.h>
#include <stdlib.h>
int main(){
    int i=0;
    printf("\n%i;%i;%i;%i;%i;%i;%i\n",i,i+1,i+2,i+3,i+4,i+5,i+6 );
}
Compiled without optimization:
.text
.section .rodata
.LC0:
.string "\n%i;%i;%i;%i;%i;%i;%i\n"
.text
.globl main
.type main, @function
main:
pushq %rbp #
movq %rsp, %rbp #,
subq $16, %rsp #,
# a.c:5: int i=0;
movl $0, -4(%rbp) #, i
# a.c:6: printf("\n%i;%i;%i;%i;%i;%i;%i\n",i,i+1,i+2,i+3,i+4,i+5,i+6 );
movl -4(%rbp), %eax # i, tmp95
leal 6(%rax), %edi #, _1
movl -4(%rbp), %eax # i, tmp96
leal 5(%rax), %esi #, _2
movl -4(%rbp), %eax # i, tmp97
leal 4(%rax), %r9d #, _3
movl -4(%rbp), %eax # i, tmp98
leal 3(%rax), %r8d #, _4
movl -4(%rbp), %eax # i, tmp99
leal 2(%rax), %ecx #, _5
movl -4(%rbp), %eax # i, tmp100
leal 1(%rax), %edx #, _6
movl -4(%rbp), %eax # i, tmp101
pushq %rdi # _1
pushq %rsi # _2
movl %eax, %esi # tmp101,
leaq .LC0(%rip), %rdi #,
movl $0, %eax #,
call printf@PLT #
addq $16, %rsp #,
movl $0, %eax #, _10
# a.c:7: }
leave
ret
.size main, .-main
.ident "GCC: (Debian 8.3.0-6) 8.3.0"
.section .note.GNU-stack,"",@progbits
Here the registers seem to be filled in the wrong order. Starting from the last argument (I am ignoring register sizes here, this is just an illustration), the order is di->si->r9->r8->cx->dx. Then di and si are pushed and reassigned to the string address and the first argument (i), so the registers now appear to be in the correct order. So how does the callee know how many arguments to pop, and in what order? (I would expect si to come before di, since si contains 5 and di contains 6.)
printf doesn't know how many arguments there are. It has to trust that the format string matches what you actually passed, and if it's wrong, it'll end up skipping some or reading other random stuff off the stack. Varargs functions that don't take a format string use a different approach to signal the end (e.g., a NULL sentinel like execlp uses, or a count variable that the programmer passes manually). Again, if you don't mark the end correctly, it'll read the wrong number of arguments.