
Understanding Linux x86_64 Syscall Implementation in NASM


I'm using the Linux kernel as the basis for my 'OS' (which is just a game). Naturally, I use system calls to interact with the kernel for input, sound, graphics, etc. Since I don't have a linker on this system to pull in the pre-defined syscall() wrapper from libc (declared in <unistd.h>; <sys/syscall.h> only provides the SYS_* numbers), I use this NASM syscall wrapper, callable from C:

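global _syscall
_syscall:
  mov rax, rdi          ; C arg 1 (call number) -> RAX
  mov rdi, rsi          ; C arg 2 -> syscall arg 1
  mov rsi, rdx          ; C arg 3 -> syscall arg 2
  mov rdx, rcx          ; C arg 4 -> syscall arg 3
  mov r10, r8           ; C arg 5 -> syscall arg 4
  mov r8, r9            ; C arg 6 -> syscall arg 5
  mov r9, [rsp + 8]
  syscall
  ret                   ; the kernel's return value is already in RAX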

I understand the register usage, which follows the System V calling convention. However, I don't understand the last mov, mov r9, [rsp + 8]. I suspect it has something to do with the return address, but I'm not sure.


Solution

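  • The 7th arg is passed on the stack in x86-64 System V. On entry to the function, [rsp] holds the return address pushed by call, so the first stack arg is at [rsp + 8].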

    This is taking the first C arg and putting it in RAX, then copying the next 6 C args to the 6 arg-passing registers of the kernel's system-call calling convention. (The same registers as for functions, but with R10 instead of RCX, because the syscall instruction itself overwrites RCX.)
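
    To make the stack slot concrete, here is a minimal C sketch, assuming x86-64 Linux and GCC or Clang with default options. It restates the question's wrapper as top-level asm (in GAS Intel syntax rather than NASM, so it builds as a single file) under the hypothetical name shuffle_syscall. It then calls mmap through it: mmap takes six kernel args, so its offset travels as the 7th C arg. demo_map_page is likewise a made-up helper, not part of any library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <sys/mman.h>          /* PROT_* and MAP_* constants only */
#include <sys/syscall.h>       /* SYS_mmap, SYS_munmap numbers only */

/* The question's wrapper, restated as top-level asm (assumption: the
   toolchain accepts GAS Intel syntax). shuffle_syscall is a hypothetical name. */
__asm__(
    ".pushsection .text\n"
    ".intel_syntax noprefix\n"
    ".globl shuffle_syscall\n"
    ".type shuffle_syscall, @function\n"
    "shuffle_syscall:\n"
    "    mov rax, rdi\n"          /* C arg 1: call number -> RAX          */
    "    mov rdi, rsi\n"          /* C arg 2 -> syscall arg 1             */
    "    mov rsi, rdx\n"          /* C arg 3 -> syscall arg 2             */
    "    mov rdx, rcx\n"          /* C arg 4 -> syscall arg 3             */
    "    mov r10, r8\n"           /* C arg 5 -> syscall arg 4 (R10)       */
    "    mov r8, r9\n"            /* C arg 6 -> syscall arg 5             */
    "    mov r9, [rsp + 8]\n"     /* C arg 7: stack slot above the return address */
    "    syscall\n"
    "    ret\n"
    ".att_syntax\n"
    ".popsection\n"
);

/* Seven integer args: six go in registers, the seventh on the stack. */
long shuffle_syscall(long nr, long a1, long a2, long a3,
                     long a4, long a5, long a6);

/* Map one anonymous page and unmap it again; returns the raw mmap result.
   `offset` is the 7th C arg, so it reaches the kernel via [rsp + 8]. */
long demo_map_page(long offset)
{
    long addr = shuffle_syscall(SYS_mmap, 0, 4096, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, offset);
    if (addr > 0)                 /* raw errors are small negative values */
        shuffle_syscall(SYS_munmap, addr, 4096, 0, 0, 0, 0);
    return addr;
}
```

    In this sketch, demo_map_page(0) should return a mapped address, while demo_map_page(1) should come back as a small negative error code, since x86-64 Linux rejects an mmap offset that is not page-aligned. That difference only shows up if [rsp + 8] really is the 7th arg.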


    The only reason the craptastic glibc syscall() function exists / is written that way is because there's no way to tell C compilers about a custom calling convention where an arg is also passed in RAX. That wrapper makes it look just like any other C function with a prototype.

    It's fine for messing around with new system calls, but as you noted it's inefficient. If you want something better in C, use inline asm macros for your ISA, e.g. https://github.com/linux-on-ibm-z/linux-syscall-support/blob/master/linux_syscall_support.h. Inline asm is hard, and historically some syscall1 / syscall2 (one per number of args) macros have been missing things like a "memory" clobber, which tells the compiler that pointed-to memory could also be an input or output. That GitHub project is safe and has code for various ISAs. (It has some missed optimizations, e.g. it could use a dummy input operand instead of a full "memory" clobber, but that's irrelevant if you're writing in asm.)
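
    As a rough illustration of what such an inline asm macro boils down to, here is a minimal sketch for x86-64 Linux with GCC or Clang. The name raw_syscall3 is hypothetical and this is not the code from the linked header, just the common shape of it: the call number goes in RAX, the args go in RDI, RSI and RDX, and the clobber list names RCX, R11 and "memory".

```c
#include <sys/syscall.h>   /* SYS_write, SYS_getpid numbers only */

/* Hypothetical helper: raw 3-argument system call via GNU C inline asm.
   Returns the kernel's raw result (a negative errno code on failure). */
static inline long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                            /* RAX: return value          */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)   /* RAX, RDI, RSI, RDX inputs  */
        : "rcx", "r11",                        /* overwritten by syscall     */
          "memory"                             /* kernel may read or write
                                                  memory the args point to   */
    );
    return ret;
}
```

    A call such as raw_syscall3(SYS_write, 1, (long)"hi\n", 3) then compiles to a few register moves plus the syscall instruction, with no argument shuffling. Calls with four or more args additionally need R10, R8 and R9, which GNU C can only express through register-asm local variables.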


    Of course, you can do much better if you're writing in asm:

    Just use the syscall instruction directly, with the call number in RAX and the args in the right registers (RDI, RSI, RDX, R10, R8, R9), instead of call _syscall with the function-calling convention. Calling the wrapper is strictly worse than inlining the syscall instruction: with syscall you know that registers are unmodified except for RAX (the return value) and RCX/R11 (syscall itself uses them to save RIP and RFLAGS before kernel code runs). And it takes just as much code to get args into registers for a function call as it does for syscall.

    If you do want a wrapper function at all (e.g. to cmp rax, -4095 / jae handle_syscall_error afterwards and maybe set errno), use the same calling convention for it as the kernel expects, so the first instruction can be syscall, not all that stupid shuffling of args over by 1.
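
    The error check mentioned above can be sketched in C as follows. This assumes the Linux convention that raw return values in the range -4095 to -1 are negated errno codes; syscall_ret is a hypothetical helper name, and the code only covers the check itself, not the calling-convention point.

```c
#include <errno.h>

/* Hypothetical helper: turn a raw kernel return value into the libc
   convention (-1 with errno set on failure, the value itself otherwise). */
static inline long syscall_ret(long raw)
{
    /* The unsigned compare mirrors `cmp rax, -4095 / jae`: only values in
       [-4095, -1] are error codes; anything else is a valid result, even a
       large value that looks negative when read as signed. */
    if ((unsigned long)raw >= (unsigned long)-4095L) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

    For example, a raw result of -9 becomes a return of -1 with errno set to 9 (EBADF), while -4096 is passed through unchanged because it lies outside the error range.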

    Functions in asm (that you only need to call from asm) can use whatever calling convention is convenient. It's a good idea to use a good standard one most of the time, but any "obviously special" function can certainly use a special convention.