
How to write inline assembly to bit rotate


I was reading gcc's guide on extended ASM and I'm running into a problem where the compiler isn't interpreting the assembly the way I thought it would. I thought I'd try it with a bit rotate instruction since those aren't readily available in C.

Here's my C function:

int rotate_right(int num,int count) {
    asm (
        "rcr %[value],%[count]"
        : [value] "=r" (num)
        : [count] "r" (count)
        );

    return num;
}

And the compiled output using x86-64 gcc (trunk) -O0:

        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     DWORD PTR [rbp-8], esi
        mov     eax, DWORD PTR [rbp-8]
        rcr eax,eax
        mov     DWORD PTR [rbp-4], eax
        mov     eax, DWORD PTR [rbp-4]
        pop     rbp
        ret

The problem I'm having is that GCC is taking my inline assembly to mean "rotate EAX by EAX" rather than by the count parameter I intended. This is what I expected to get:

        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], edi
        mov     DWORD PTR [rbp-8], esi
        mov     eax, DWORD PTR [rbp-8]
        mov     ecx, DWORD PTR [rbp-4]
        rcr     eax,ecx
        pop     rbp
        ret

Solution

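  • Use a +r constraint for num, indicating that num is read as well as written. Otherwise gcc will assume that the previous value of num doesn't matter and will just pick an unused register to dump the output into. That is why your output shows rcr eax,eax: num was never loaded, so the output register was free to coincide with the one holding count.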

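    You'll also have to use a c constraint for count, as a variable rotate count must be in cl for the ror instruction (the same holds for the rcr in your code). Note that rcr rotates through the carry flag, so for a plain rotate ror is presumably the instruction you want. Refer to the other answer for a more detailed explanation.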

    Before doing any inline assembly programming, read the manual carefully! It is somewhat tricky to get right and there are many subtle details to pay attention to.

    Also note that even if the inline assembly seems to work, it may still be incorrect, e.g. due to missing clobbers that just happen not to affect anything relevant with this particular compiler version, at this particular optimisation level, for this particular version of the code. So be extra careful and avoid inline assembly where you can.

    For example in your case, you can just use the standard C rotation idiom. The compiler will pick it up as long as optimisations are enabled:

    #include <limits.h>
    
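    int rotate_right(int num, int count) {
        unsigned value = num;
        unsigned mask = CHAR_BIT * sizeof value - 1;
        /* Masking both shift counts avoids undefined behaviour when count is 0,
           and shifting an unsigned copy avoids left-shifting a negative int. */
        return value >> (count & mask) | value << (-(unsigned)count & mask);
    }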
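
If you still want the inline assembly version, the two constraint changes described at the top of this answer could look roughly like the sketch below. A few assumptions on my part that are not in the question: I use ror rather than rcr (assuming a plain rotate is what you're after), I take an unsigned argument and assume it is 32 bits wide, the name rotate_right_asm is made up, and I added a plain C fallback so the file still compiles on non-x86 targets. The {att | intel} braces are GCC's dialect alternatives, so the same string should assemble with or without -masm=intel.

```c
/* Sketch only: rotate a 32-bit value right by count bits.
   rotate_right_asm is a hypothetical name, not taken from the question. */
unsigned rotate_right_asm(unsigned num, int count) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ (
        /* AT&T form first, Intel form second; %b prints the byte register (cl). */
        "ror{l %b[count], %[value] | %[value], %b[count]}"
        : [value] "+r" (num)     /* "+r": num is both read and written */
        : [count] "c" (count)    /* "c": a variable rotate count must be in cl */
        : "cc"                   /* ror modifies the flags */
    );
    return num;
#else
    /* Assumed fallback for other targets: the portable rotate idiom. */
    unsigned c = (unsigned)count & 31u;
    return num >> c | num << (-c & 31u);
#endif
}
```

Calling rotate_right_asm(0x12345678u, 8) should give 0x78123456. Even so, the plain C idiom above is the safer choice, since the compiler can see through it and optimise around it.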