c++ gcc assembly x86-64 calling-convention

Asm insertion in naked function


I have Ubuntu 16.04 on x86_64, kernel version 4.15.0-39-generic, and GCC 8.1.0.

I tried to rewrite these functions (from the first post at https://groups.google.com/forum/#!topic/comp.lang.c++.moderated/qHDCU73cEFc) from Intel syntax to AT&T syntax, and I did not succeed.

namespace atomic {
  __declspec(naked)
  static void*
  ldptr_acq(void* volatile*) {
    _asm {
      MOV EAX, [ESP + 4]
      MOV EAX, [EAX]
      RET
    }
  }

  __declspec(naked)
  static void*
  stptr_rel(void* volatile*, void* const) {
    _asm {
      MOV ECX, [ESP + 4]
      MOV EAX, [ESP + 8]
      MOV [ECX], EAX
      RET
    }
  }
}

Then I wrote a simple program that should return the same pointer I pass in. I installed GCC 8.1, which supports the naked attribute for functions (https://gcc.gnu.org/gcc-8/changes.html: "The x86 port now supports the naked function attribute"). As far as I remember, this attribute tells the compiler not to generate the function's prologue and epilogue, so I can read the parameters from the stack and return the result myself. Code (does not work, it segfaults):

#include <cstdio>
#include <cstdlib>

  __attribute__ ((naked))
  int *get_num(int*) {
    __asm__  (
      "movl 4(%esp), %eax\n\t"
      "movl (%eax), %eax\n\t"
      "ret"
    );
  }

int main() {
    int *i =(int*) malloc(sizeof(int));
    *i = 5;

    int *j = get_num(i);
    printf("%d\n", *j);

    free(i);
    return 0;
}

Then I tried using 64-bit registers (this also segfaults):

__asm__  (
  "movq 4(%rsp), %rax\n\t"
  "movq (%rax), %rax\n\t"
  "ret"
);

Only after I took the value from the rdi register did it all work:

__asm__  (
  "movq %rdi, %rax\n\t"
  "ret"
);

Why did I fail to read the argument through the stack pointer? I probably made a mistake. Where is it?


Solution

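  • Because the x86-64 System V calling convention passes the first six integer/pointer args in registers (rdi, rsi, rdx, rcx, r8, r9), not on the stack, unlike the old inefficient i386 System V calling convention. On entry to your function, (%rsp) holds the return address and the pointer arg is in rdi, so there is no argument to find at 4(%rsp).

    You always have to write asm that matches the calling convention if you're writing the whole function in asm, like with a naked function or a stand-alone .S file.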

    GNU C extended asm allows you to use operands to specify the inputs to an asm statement, and the compiler will generate instructions to make that happen. (I wouldn't recommend using it until you understand asm and how compilers turn C into asm with optimization enabled, though.)
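
    As a rough illustration of operands in extended asm, here is a minimal sketch. The names load_via_asm and check_load_via_asm are hypothetical, and the asm path assumes x86-64 with GCC or clang in the default AT&T syntax; on other targets the sketch falls back to plain C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: loads *p with one explicit mov on x86-64.
static inline std::uint64_t load_via_asm(const std::uint64_t* p) {
#if defined(__x86_64__) && defined(__GNUC__)
    std::uint64_t result;
    // %0 ("=r"): output, the compiler picks any general-purpose register.
    // %1 ("m"):  input, *p as a memory operand; the compiler chooses the
    //            addressing mode, so the asm never hard-codes rdi or rsp.
    __asm__("movq %1, %0" : "=r"(result) : "m"(*p));
    return result;
#else
    return *p;  // assumption: plain load on targets without this asm
#endif
}

// Hypothetical self-check showing how the helper would be called.
static inline void check_load_via_asm() {
    std::uint64_t v = 42;
    assert(load_via_asm(&v) == 42);
}
```

    Because the compiler fills in the operands, the same statement keeps working after inlining, when the pointer may live in any register.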


    Also note that movq %rdi, %rax implements long *foo(long*p){return p;} not return *p. Perhaps you meant mov (%rdi), %rax to dereference the pointer arg?
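
    To make the difference concrete, here is a sketch of both variants as naked functions. It assumes the x86-64 System V convention (Linux, macOS) and GCC 8 or later, or clang; the names get_ptr, get_num and check_naked_sketch are chosen for illustration, and other targets get a plain C++ fallback.

```cpp
#include <cassert>

#if defined(__x86_64__) && !defined(_WIN32) && \
    (defined(__clang__) || (defined(__GNUC__) && __GNUC__ >= 8))
// x86-64 System V: the first pointer arg arrives in %rdi,
// the return value goes in %rax (%eax for a 32-bit int).
__attribute__((naked)) int* get_ptr(int*) {
    __asm__("movq %rdi, %rax\n\t"     // return p;
            "ret");
}

__attribute__((naked)) int get_num(int*) {
    __asm__("movl (%rdi), %eax\n\t"   // return *p;
            "ret");
}
#else
// Assumption: fallback for other targets, same behavior in plain C++.
int* get_ptr(int* p) { return p; }
int get_num(int* p) { return *p; }
#endif

// Hypothetical self-check mirroring the main() from the question.
inline void check_naked_sketch() {
    int x = 5;
    assert(get_ptr(&x) == &x);  // same pointer comes back
    assert(get_num(&x) == 5);   // pointed-to value comes back
}
```

    Neither version touches the stack apart from the ret, which pops the return address that call pushed at (%rsp).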


    And BTW, you definitely don't need and shouldn't use inline asm for this. See https://gcc.gnu.org/wiki/DontUseInlineAsm and https://stackoverflow.com/tags/inline-assembly/info

    In GNU C, you can cast a pointer to volatile uint64_t* and dereference it to force exactly one load. Or you can use __atomic_load_n(ptr, __ATOMIC_ACQUIRE) to get basically everything you were getting from that asm, without the overhead of a function call and without the optimizer having to treat all the call-clobbered registers as clobbered at the call site.

    You can use the __atomic builtins on any object (https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html), unlike C++11, where you can only do atomic ops on a std::atomic<T>.
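
    A minimal sketch of the two functions from the question written with the builtins instead of asm, assuming GCC or clang. The names mirror the question; load_once is a hypothetical helper showing the volatile cast.

```cpp
#include <cassert>
#include <cstdint>

namespace atomic {
  // Acquire load of a pointer: no asm, inlinable at the call site.
  inline void* ldptr_acq(void* volatile* p) {
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
  }

  // Release store of a pointer.
  inline void stptr_rel(void* volatile* p, void* v) {
    __atomic_store_n(p, v, __ATOMIC_RELEASE);
  }
}

// Hypothetical helper: a volatile access makes the compiler emit the
// load as written; it does not add any memory ordering.
inline std::uint64_t load_once(const std::uint64_t* p) {
  return *const_cast<const volatile std::uint64_t*>(p);
}

// Hypothetical self-check: store a pointer, then load it back.
inline void check_atomic_sketch() {
  int value = 5;
  void* slot = nullptr;
  atomic::stptr_rel(&slot, &value);
  assert(atomic::ldptr_acq(&slot) == &value);
}
```

    On x86-64 both builtins normally compile to a plain mov, so nothing is lost compared with the hand-written version.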