c clang compiler-optimization inline-assembly memory-alignment

Aligning label values in dead code with gcc/clang


Some compilers (I'm mostly interested in GCC and clang) let you take the address of a label with void *jmp = &&label and later jump to it with goto *jmp. For an interpreter that uses pointer tagging, I want every label aligned to an 8-byte boundary, so that the low three bits of each label value are zero (0b…000). Normally you can align the next instruction from inline assembly: __asm__ volatile(".p2align 8"); pads up to the next 2^8 = 256-byte boundary, e.g. with NOPs (.p2align 3 would be enough for an 8-byte boundary). However, because the interpreter has no fall-through code between its jump blocks, the directive sits in unreachable code and gets optimized out (or something similar: even with volatile it doesn't appear in the output).

void test() {
    void *next = &&LABEL_A;
    goto *next;

    /* never reached, so presumably optimized away */
    __asm__ volatile(".p2align 8");
LABEL_A:
    goto *next;
}

On Godbolt, the generated assembly shows that the compiler doesn't align LABEL_A. However, when we change the code to

void test() {
    unsigned long foo = (unsigned long)&&LABEL;
    void *next = &&LABEL_A;
    goto *next;
LABEL:
    __asm__ volatile(".p2align 8");
LABEL_A:
    goto *next;
}

the output shows that LABEL_A is padded correctly, presumably because LABEL (whose address is taken) and the statements after it can no longer be eliminated. This seems extremely fragile, though, and it stops working at -O3. Is there a better way?

I tried several attributes such as __attribute__((aligned(8), used)), as well as volatile values and accesses between the blocks, but only the reference to LABEL prevented the code from being eliminated.
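To make the requirement concrete, here is a minimal sketch of the kind of tagging this alignment is meant to enable. The names TAG_MASK, is_taggable, tag_ptr, ptr_tag and untag_ptr are hypothetical and not taken from any real interpreter; the sketch only assumes that converting a pointer to uintptr_t and back behaves as it does on common platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tagging helpers: the low three bits of an 8-byte-aligned
   address are always zero, so they can carry a small tag (0..7). */
#define TAG_MASK ((uintptr_t)7) /* 0b111 */

/* Nonzero when the low three bits of p are free for a tag. */
static int is_taggable(const void *p)
{
    return ((uintptr_t)p & TAG_MASK) == 0;
}

/* Store tag in the low bits of p. Requires an 8-byte-aligned p. */
static void *tag_ptr(void *p, unsigned tag)
{
    assert(is_taggable(p));
    assert(tag <= TAG_MASK);
    return (void *)((uintptr_t)p | (uintptr_t)tag);
}

/* Extract the tag stored in the low three bits. */
static unsigned ptr_tag(const void *p)
{
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Recover the original address by clearing the tag bits. */
static void *untag_ptr(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A label value that is not 8-byte aligned would fail the is_taggable check, which is exactly the situation the padding is supposed to rule out.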


Solution

  • For GCC, use the -falign-labels=8 option, which aligns all branch targets, or the weaker -falign-jumps=8, which only aligns targets that can be reached solely by jumping.

    You can also apply them to a single function with an attribute: __attribute__((optimize("-falign-labels=8"))).
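As a rough sketch of how the attribute might be used, the following tiny dispatch loop applies it to one function and counts how many handler labels are not 8-byte aligned, rather than assuming the option took effect. The macro ALIGN_LABELS_8, the function run and the three opcodes are hypothetical names made up for this illustration. Whether the attribute is honored can depend on the compiler and its version, and GCC's documentation describes the optimize attribute as meant for debugging rather than production code, so passing -falign-labels=8 on the command line may be the more robust choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: only request the option on GCC itself,
   since other compilers may not recognize the optimize attribute. */
#if defined(__GNUC__) && !defined(__clang__)
#define ALIGN_LABELS_8 __attribute__((optimize("-falign-labels=8")))
#else
#define ALIGN_LABELS_8
#endif

/* Opcodes of a made-up three-instruction bytecode. */
enum { OP_INC = 0, OP_DBL = 1, OP_HALT = 2 };

/* Runs the bytecode and returns the accumulator. *misaligned receives the
   number of handler labels whose low three bits are not zero. */
ALIGN_LABELS_8
long run(const unsigned char *code, unsigned *misaligned)
{
    /* Label addresses taken with the GNU "labels as values" extension. */
    static void *const dispatch[] = { &&op_inc, &&op_dbl, &&op_halt };
    long acc = 0;

    assert(code != NULL && misaligned != NULL);

    /* Check the alignment at run time instead of trusting the option. */
    *misaligned = 0;
    for (size_t i = 0; i < sizeof dispatch / sizeof dispatch[0]; i++)
        if (((uintptr_t)dispatch[i] & 7u) != 0)
            ++*misaligned;

    goto *dispatch[*code++];

op_inc:
    acc += 1;
    goto *dispatch[*code++];
op_dbl:
    acc *= 2;
    goto *dispatch[*code++];
op_halt:
    return acc;
}
```

Running the program { OP_INC, OP_INC, OP_DBL, OP_HALT } returns 4. The value written to *misaligned tells you whether the handlers are safe to tag in the build you actually produced, which is worth checking at each optimization level you care about.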