c, assembly, visual-c++, x86-64, calling-convention

Is there any way to save registers before jumping into a function?


This is my first question; I couldn't find anything related to this topic.

Recently, while making a class for my C game engine project, I found something interesting:

struct Stack *S1 = new(Stack);
struct Stack *S2 = new(Stack);

S1->bPush(S1, 1, 2);               //at this point

bPush is a function pointer in the structure.

So I wondered what the -> operator does in that case, and I discovered:

 mov         r8b,2                 ; a char, written to the low byte of register r8
 mov         dl,1                  ; also a char, but to the low byte of rdx this time
 mov         rcx,qword ptr [S1]    ; this is the 1st parameter of the function
 mov         rax,qword ptr [S1]    ; why can't I use this one?
 call        qword ptr [rax+1A0h]  ; call through the function pointer

So I assume -> writes the object pointer to rcx, and I'd like to use it inside the functions (they are meant to be methods). So the question is: how can I do something like

 push        rcx
 ; set up the other call arguments
 pop         rcx
 mov         qword ptr [this], rcx

before the compiler starts writing the function's other arguments? Could this be done with the preprocessor?


Solution

  • It looks like you'd have an easier time (and get asm that's the same or more efficient) if you wrote in C++ so you could use language built-in support for virtual functions, and for running constructors on initialization. Not to mention not having to manually run destructors. You wouldn't need your struct Class hack.

    "I'd like to implicitly pass the *this pointer, because as the second asm listing shows, it does the same thing twice. Yes, that is what I'm looking for: bPush is part of a struct and cannot be called from outside, but I still have to pass the pointer S1, which it already has."

    You get inefficient asm because you disabled optimization.

    MSVC -O2 or -Ox doesn't reload the static pointer twice. It does waste a mov instruction copying between registers, but if you want better asm use a better compiler (like gcc or clang).

    The oldest MSVC on the Godbolt compiler explorer is CL19.0 from MSVC 2015, which compiles this source

    struct Stack {
        int stuff[4];
        void (*bPush)(struct Stack*, unsigned char value, unsigned char length);
    };
    
    
    struct Stack *const S1 = new(Stack);
    
    int foo(){
        S1->bPush(S1, 1, 2);
    
        //S1->bPush(S1, 1, 2);
        return 0;  // prevent tailcall optimization
    }
    

    into this asm (Godbolt)

    # MSVC 2015  -O2
    int foo(void) PROC                                        ; foo, COMDAT
    $LN4:
            sub     rsp, 40                             ; 00000028H
            mov     rax, QWORD PTR Stack * __ptr64 __ptr64 S1
            mov     r8b, 2
            mov     dl, 1
            mov     rcx, rax                   ;; copy RAX to the arg-passing register
            call    QWORD PTR [rax+16]
            xor     eax, eax
            add     rsp, 40                             ; 00000028H
            ret     0
    int foo(void) ENDP                                        ; foo
    

    (I compiled in C++ mode so I could write S1 = new(Stack) without having to copy your github code, and write it at global scope with a non-constant initializer.)

    Clang 7.0 -O3 loads into RCX straight away:

    # clang -O3
    foo():
            sub     rsp, 40
            mov     rcx, qword ptr [rip + S1]
            mov     dl, 1
            mov     r8b, 2
            call    qword ptr [rcx + 16]          # uses the arg-passing register
            xor     eax, eax
            add     rsp, 40
            ret
    

    Strangely, clang only decides to use low-byte registers when targeting the Windows ABI with __attribute__((ms_abi)). It uses mov esi, 1 to avoid false dependencies when targeting its default Linux calling convention, not mov sil, 1.


    If you are already compiling with optimization, then the duplicate load is because your even older MSVC is even worse. In that case you probably can't do anything in the C source to fix it, although you might try using a struct Stack *p = S1 local variable to hand-hold the compiler into loading the pointer into a register once and reusing it from there.
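The local-variable suggestion can be sketched in plain C as below. This is only an illustration under stated assumptions: the cut-down struct Stack, stack_push, stack_init and push_twice are hypothetical stand-ins, not the code from the question. The CALL macro is likewise a hypothetical helper, not something the compiler or this answer provides. It is one possible response to the "something with preprocessor?" part of the question: it lets the source name the object once, but it still passes the pointer explicitly in the generated code.

```c
#include <stddef.h>

/* Cut-down stand-in for the question's struct; the fields are illustrative. */
struct Stack {
    unsigned char data[16];
    size_t top;
    void (*bPush)(struct Stack *self, unsigned char value, unsigned char length);
};

/* Hypothetical method body: appends `value` to the stack `length` times. */
static void stack_push(struct Stack *self, unsigned char value, unsigned char length)
{
    for (unsigned char i = 0; i < length && self->top < sizeof self->data; i++)
        self->data[self->top++] = value;
}

/* Hypothetical initializer standing in for the question's new(Stack). */
static void stack_init(struct Stack *s)
{
    s->top = 0;
    s->bPush = stack_push;
}

/* Hypothetical helper macro: the object is written once in the source.
   It expands `obj` twice, so avoid arguments with side effects. */
#define CALL(obj, method, ...) ((obj)->method((obj), __VA_ARGS__))

static struct Stack g_stack;
static struct Stack *S1 = &g_stack;   /* global pointer, as in the question */

/* Copy the global pointer into a local once, then reuse the local. */
size_t push_twice(void)
{
    stack_init(S1);
    struct Stack *p = S1;      /* the global is read here, once */
    p->bPush(p, 1, 2);         /* explicit form: pointer passed by hand */
    CALL(p, bPush, 7, 3);      /* same kind of call through the macro */
    return p->top;             /* number of bytes pushed so far */
}
```

Whether the local copy changes the generated asm depends on the compiler and optimization level; with optimization enabled, the compilers shown above already load the pointer only once.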