c, assembly, inline-assembly, cpu-registers, mov

inline assembly - useless intermediate copy instructions


I'm trying to write a scheduler to run what we call "fibers". Unfortunately, I'm not really used to writing inline assembly.

typedef struct fiber {
    //fiber's stack
    long rsp;
    long rbp;

    //next fiber in ready list
    struct fiber *next;

} fiber;

//currently executing fiber
fiber *fib;

So the very first task is - obviously - creating a fiber for the main function so it can be suspended.

int main(int argc, char* argv[]){

    //create fiber for main function
    fib = malloc(sizeof(*fib));
    __asm__(
        "movq %%rsp, %0;"
        "movq %%rbp, %1;"
         : "=r"(fib->rsp),"=r"(fib->rbp)
         );

    //jump to actual main and execute
    __asm__(...);

}

This gets compiled to

    movl    $24, %edi   #,
    call    malloc  #
#APP
# 27 "scheduler.c" 1
    movq %rsp, %rcx;movq %rbp, %rdx;    # tmp92, tmp93
# 0 "" 2
#NO_APP
    movq    %rax, fib(%rip) # tmp91, fib
    movq    %rcx, (%rax)    # tmp92, MEM[(struct fiber *)_3].rsp
    movq    %rdx, 8(%rax)   # tmp93, MEM[(struct fiber *)_3].rbp

Why does the compiler emit movs into temporary registers first? Can I somehow get rid of them?

The first version of this question had asm output from gcc -O0, with even more instructions and temporaries.

Turning on optimisations does not get rid of them.


Solution

  • Turning on optimisations does not get rid of the temporaries

    It did get rid of some extra loads and stores. fib still lives in memory because you declared it as a global variable. %rax holds the return value of malloc, which has to be stored to fib in memory. The other two instructions store into the members of *fib, and those stores are required as well.

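    Since you specified register outputs ("=r"), the asm block can't write directly into memory: the compiler picks a register for each output and then emits its own store to the struct member. That's easy to fix with a memory constraint ("=m"), though: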

    __asm__(
        "movq %%rsp, %0;"
        "movq %%rbp, %1;"
         : "=m"(fib->rsp),"=m"(fib->rbp)
         );
    

    This will generate:

        call    malloc
        movq    %rax, fib(%rip)
        movq    %rsp, (%rax)
        movq    %rbp, 8(%rax)
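
For reference, here is a minimal self-contained sketch of the same "=m" technique, assuming x86-64 and GCC or Clang extended asm. The helper name fiber_save_stack is hypothetical and not part of the question. The struct tag, the volatile qualifier, and the always_inline attribute are also additions: volatile keeps the compiler from merging or dropping the asm as if it were a pure computation, and always_inline makes the captured %rsp and %rbp belong to the caller rather than to the helper's own frame.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the fiber struct in the question, with a struct tag
   added so that `next` refers to this type. */
typedef struct fiber {
    long rsp;            /* saved stack pointer */
    long rbp;            /* saved frame pointer */
    struct fiber *next;  /* next fiber in ready list */
} fiber;

/* Hypothetical helper (not in the question): stores the current %rsp and
   %rbp straight into *f. Assumes x86-64 and GCC/Clang extended asm.
   The "=m" constraints let each movq use the struct member as its
   destination, so no temporary register is needed. */
static inline __attribute__((always_inline)) void fiber_save_stack(fiber *f)
{
    assert(f != NULL);
    __asm__ volatile(
        "movq %%rsp, %0\n\t"
        "movq %%rbp, %1"
        : "=m"(f->rsp), "=m"(f->rbp));
}
```

In the question's main this would be called as fiber_save_stack(fib) right after the malloc. Whether the generated code matches the listing above exactly depends on the compiler version and flags.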