
inline AT&T asm syntax for using opcode directly instead of mnemonic


For unfortunate reasons I can't get into, I have to support an ancient assembler that doesn't have a mapping for a mnemonic I need.

I know the hardware supports it, but I can't seem to find any documentation online for how to use an opcode instead of a mnemonic.

Does anyone have a reference for how to do it in inline AT&T syntax on GCC?


Solution

  • Luckily rdrand only takes a single argument and that is a register. As such you only need to cover a few cases if you want to allow the compiler to choose freely. Beware, it's still quite ugly :)

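    // static, so a C99 build without optimization still has a definition to link against
    static inline int rdrand(void)
    {
        int result;
        __asm__ __volatile__ (
            // rdrand r32 is 0F C7 /6; the ModRM byte emitted below is 0xF0 | register number
            ".byte 0x0f, 0xc7\n\t"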
            ".ifc %0, %%eax\n\t"
            ".byte 0xf0\n\t"
            ".else\n\t"
            ".ifc %0, %%ebx\n\t"
            ".byte 0xf3\n\t"
            ".else\n\t"
            ".ifc %0, %%ecx\n\t"
            ".byte 0xf1\n\t"
            ".else\n\t"
            ".ifc %0, %%edx\n\t"
            ".byte 0xf2\n\t"
            ".else\n\t"
            ".ifc %0, %%esi\n\t"
            ".byte 0xf6\n\t"
            ".else\n\t"
            ".ifc %0, %%edi\n\t"
            ".byte 0xf7\n\t"
            ".else\n\t"
            ".ifc %0, %%ebp\n\t"
            ".byte 0xf5\n\t"
            ".else\n\t"
            ".error \"unknown register\"\n\t"
            ".endif\n\t"
            ".endif\n\t"
            ".endif\n\t"
            ".endif\n\t"
            ".endif\n\t"
            ".endif\n\t"
            ".endif\n\t"
        : "=R" (result) : : "cc");
    
        // "=R" excludes r8d..r15d in 64-bit mode
        return result;
    }
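
    For reference, the last byte in each branch above is the ModRM byte, and it can be derived rather than memorized. A minimal sketch, assuming the usual x86 register numbering; `rdrand_modrm` is a hypothetical helper name used only for illustration:

    ```c
    #include <stdint.h>

    /* Hypothetical helper: ModRM byte for "rdrand reg" (opcode 0F C7 /6).
       mod = 11b   register-direct operand
       reg = 6     the /6 opcode extension
       rm  = hardware register number
             (eax=0, ecx=1, edx=2, ebx=3, ebp=5, esi=6, edi=7) */
    static uint8_t rdrand_modrm(unsigned reg_num)
    {
        return (uint8_t)(0xC0u | (6u << 3) | (reg_num & 7u));
    }
    ```

    So `rdrand_modrm(3)` gives 0xf3 for ebx, which matches the `.ifc %0, %%ebx` branch.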
    

    For 64-bit operand-size, you'll need a REX.W (0x48) prefix, but the "=R" constraint instead of "=r" will avoid needing any other bits set in the REX prefix.
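
    A shorter way to sketch the 64-bit case is to pin the output to rax with "=a" instead of letting the compiler choose, so the byte sequence is fixed and no `.ifc` chain is needed. This assumes an x86-64 target; `rdrand64` is a name made up for this example, and the trade-off is that the compiler loses its freedom to pick a register:

    ```c
    #include <stdint.h>

    /* Sketch only: the result is forced into rax so the bytes can be hard-coded.
       0x48        REX.W prefix, selects 64-bit operand size
       0x0f 0xc7   rdrand opcode
       0xf0        ModRM for /6 with rax as the operand */
    static inline uint64_t rdrand64(void)
    {
        uint64_t result;
        __asm__ __volatile__ (
            ".byte 0x48, 0x0f, 0xc7, 0xf0"
            : "=a" (result) : : "cc");
        return result;
    }
    ```

    Like the 32-bit version, this ignores the carry flag, so a failed read is not detected.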

    Note that rdrand also reports success in the carry flag (CF=1 when a random value was returned, CF=0 when none was available); handling that is left as an exercise for the reader. GCC 6 and later can use flag output operands, which is more efficient than setcc.
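
    One possible take on that exercise, as a rough sketch rather than a definitive version: the output is pinned to eax so the opcode bytes stay fixed, and the carry flag is read either through a flag output operand (when the compiler defines `__GCC_ASM_FLAG_OUTPUTS__`) or through a plain `setc`. The name `rdrand32_checked` and its pointer-style interface are assumptions made for this example:

    ```c
    /* Sketch: returns 1 if rdrand delivered a value into *out, 0 otherwise. */
    static inline int rdrand32_checked(unsigned int *out)
    {
    #ifdef __GCC_ASM_FLAG_OUTPUTS__
        int ok;
        /* "=@ccc" hands the carry flag to the compiler directly (GCC 6+) */
        __asm__ __volatile__ (
            ".byte 0x0f, 0xc7, 0xf0"          /* rdrand %eax */
            : "=a" (*out), "=@ccc" (ok));
        return ok;
    #else
        unsigned char ok;
        /* older compilers: materialize CF with setc */
        __asm__ __volatile__ (
            ".byte 0x0f, 0xc7, 0xf0\n\t"      /* rdrand %eax */
            "setc %1"
            : "=a" (*out), "=qm" (ok)
            :
            : "cc");
        return ok;
    #endif
    }
    ```

    A caller would typically retry a few times when this returns 0 before giving up.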