c clang x86-64 inline-assembly function-attributes

Forcing a function to be optimized on clang or prologueless non-naked C functions - paste together blocks of asm based on C constants


Is there any way to force a C function on clang to be optimized even when the file is compiled with -O0?

I'm looking for something equivalent to gcc's __attribute((optimize("Os"))) or __attribute((optimize(3))).

(Related: In clang, how do you use per-function optimization attributes?)


What I'm trying to do is generate certain functions in almost pure assembly via a macro; the remaining C code in there shouldn't generate any assembly code. Ideally, the macro would use C integer constant expressions to choose which code to paste, and writing static before it would make the generated function static. I also want no stack manipulation in the function's prologue.

On GCC something like:

enum { CONSTANT = 0 };
__attribute((optimize("Os"),noinline,noipa))
int foo(void){
    if (CONSTANT) asm("mov $1, %eax; ret;");
    else asm("xor %eax, %eax; ret;");
    __builtin_unreachable();
}

gets the gist of it successfully. On clang, the optimize attribute is unrecognized and a push %rbp; mov %rsp, %rbp prologue is generated which would break my real use case, as well as the ret in this toy example, so it's most undesirable.

On GCC, __attribute((naked)) also works to eliminate the prologue and disable inlining and inter-procedural analysis (IPA), but clang hard-rejects it here: clang enforces the requirement that naked functions consist of pure assembly, and does not even allow C code that generates no instructions.

Per the GCC docs for x86 function attributes:

naked

This attribute allows the compiler to construct the requisite function declaration, while allowing the body of the function to be assembly code. The specified function will not have prologue/epilogue sequences generated by the compiler. Only basic asm statements can safely be included in naked functions (see Basic Asm). While using extended asm or a mixture of basic asm and C code may appear to work, they cannot be depended upon to work reliably and are not supported.

While not supported, it was working well enough for my use-case. The hack with __attribute__((optimize("Os"),noinline,noipa)) is even more hacky but does in fact compile to the asm I want with current GCC. I'd like to do something similar with clang.


Solution

  • How about putting the selector and the two alternatives into three separate functions, with the alternatives marked __attribute((naked))? Their bodies are pure asm, which clang accepts in a naked function. Something like this:

    enum { CONSTANT = 0 };
    __attribute((naked))
    int foo1(void){
        asm("mov $1, %eax; ret;");
    }
    __attribute((naked))
    int foo0(void){
        asm("xor %eax, %eax; ret;");
    }
    int foo(void){
        if (CONSTANT) return foo1();
        else return foo0();
    }
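
A different route, not taken from the answer above, is to keep a single naked function and let the assembler do the selection. This is only a minimal sketch under assumptions: x86-64, AT&T syntax, and an assembler that understands `.if`/`.else`/`.endif` (GNU as and clang's integrated assembler do). The constant is passed as an `"i"` operand and printed with the `%c0` modifier so that `.if` sees a bare number. Note that the GCC documentation quoted in the question says only basic asm is supported in naked functions, so the extended asm here relies on behavior that is not guaranteed. The `#else` branch is just a plain C fallback so the example still builds on other targets.

```c
/* Sketch: pick the instruction sequence at assembly time from a C constant.
   Assumes x86-64, AT&T syntax, and an assembler with .if/.else/.endif. */
enum { CONSTANT = 0 };

#if defined(__x86_64__)
__attribute__((naked))
int foo(void)
{
    /* "i" passes CONSTANT as an immediate operand; %c0 prints it without
       the '$' prefix so the assembler's .if directive can evaluate it.
       Only one of the two alternatives ends up in the object file. */
    __asm__(".if %c0\n\t"
            "mov $1, %%eax\n\t"
            ".else\n\t"
            "xor %%eax, %%eax\n\t"
            ".endif\n\t"
            "ret"
            : /* no outputs */
            : "i"(CONSTANT));
}
#else
/* Fallback for non-x86-64 targets: same result, ordinary C. */
int foo(void) { return CONSTANT ? 1 : 0; }
#endif
```

Because the function body contains nothing but an asm statement, clang should accept the naked attribute, and no prologue is emitted even at -O0. Unlike the three-function version, there is no call from a selector function, so only the chosen instructions plus the ret are assembled.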