How to write multiple assembly statements within asm() without "\t\n" separating each line using GCC?



I've seen some textbooks write multiple assembly statements within asm() as:

asm("
movl $4, %eax
movl $2, %ebx
addl %eax, %ebx
...
");

However, my compiler (GCC) doesn't recognize this syntax. Instead, I must either end each line with "\t\n" or use multiple asm() statements:

asm(
"movl $4, %eax\t\n"
"movl $2, %ebx\t\n"
"addl %eax, %ebx\t\n"
...);

or

asm("movl $4, %eax");
asm("movl $2, %ebx");
asm("addl %eax, %ebx");
...

How do I enable the "clean" syntax with no "\t\n" or repeated asm()?


Solution

  • GCC

    Your inline assembly is ill-advised because it alters registers without informing the compiler. You should use GCC's extended inline assembly with proper input and output constraints. Inline assembly should be a last resort, and you should understand exactly what you are doing. GCC's inline assembly is very unforgiving: code that seems to work may not actually be correct.
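As a minimal sketch of what "proper input and output constraints" can look like, the function below adds two integers with a single instruction and lets the compiler choose the registers. The name add_asm is hypothetical (it is not from the question), and the preprocessor guard with a plain C fallback is an assumption added so the snippet still builds on non-x86 targets.

```c
/* Hypothetical helper: adds two ints using GCC extended asm on x86. */
static int add_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    int result = b;
    __asm__("addl %1, %0"      /* AT&T syntax: result += a */
            : "+r"(result)     /* %0: read-write operand in a register */
            : "r"(a)           /* %1: input operand in a register */
            : "cc");           /* addl modifies the flags */
    return result;
#else
    return a + b;              /* assumed fallback for non-x86 targets */
#endif
}
```

Called as add_asm(4, 2), this returns 6. Because the operands are declared, the compiler knows which values are read and written and does not need any hard-coded register to be listed as clobbered.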

    That said, ending each string with \n\t makes the generated assembly code look cleaner. You can see this by compiling with the -S option to generate the corresponding assembly code. You also have the option of using a ; (semicolon). This separates the instructions but emits all of them on the same assembly line. And yes, this matters: looking at the -S output is a good way to see how the compiler substituted operands into your asm template and what code it placed around yours.

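    Another option is to use the C line continuation character \ (backslash). Although the following produces excessive whitespace in the generated assembly code, it compiles and assembles as expected. Note that because this is extended asm (it has a clobber list), register names in the template must be written with a doubled %%: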

    int main()
    {
        __asm__("movl $4, %%eax; \
                 movl $2, %%ebx; \
                 addl %%eax, %%ebx"
              ::: "eax", "ebx");
    }
    

    Although this is a way of doing it, I'm not suggesting that this is good form. I have a preference for the form you use in your second example using \n\t without line continuation characters.
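For comparison, here is a sketch of the \n\t form written as extended asm, so the three instructions from the question live in one statement and the result is handed back to C. The function name sum_4_and_2 is hypothetical, and the guard with a C fallback is an assumption for non-x86 builds. Adjacent string literals are concatenated by the compiler, so no line continuation is needed.

```c
/* Hypothetical helper: computes 4 + 2 with three instructions in one asm. */
static int sum_4_and_2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    int out;
    __asm__("movl $4, %%eax\n\t"   /* %% is required for registers in extended asm */
            "movl $2, %0\n\t"      /* %0 is the compiler-chosen output register */
            "addl %%eax, %0"
            : "=r"(out)            /* output operand */
            :                      /* no input operands */
            : "eax", "cc");        /* eax and the flags are modified */
    return out;
#else
    return 4 + 2;                  /* assumed fallback for non-x86 targets */
#endif
}
```

Since eax is listed as clobbered, the compiler will not pick it for the output operand, and the function returns 6.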


    Regarding splitting up multiple instructions into separate ASM statements:

    asm("movl $4, %eax");
    asm("movl $2, %ebx");     // unsafe, no operands specifying connections
    asm("addl %eax, %ebx");
    

    This is problematic. The compiler can reorder these statements relative to one another, because they are basic asm statements with no operands to express dependencies between them. It is possible for a compiler to generate this code:

    movl $4, %eax
    addl %eax, %ebx
    movl $2, %ebx
    

    This, of course, would not produce the result you expect. When you place all the instructions in a single asm statement, they are emitted in the order you specify.


    MSVC/C++

    32-bit Microsoft C and C++ compilers support a language extension that lets you place multi-line inline assembly between __asm { and }. With this mechanism you don't place the inline assembly in a C string, you don't need line continuation, and you don't need to end each statement with a ; (semicolon).

    An example of this would be:

    __asm {
        mov eax, 4
        mov ebx, 2
        add ebx, eax
    }