On some compilers (I'm mostly interested in GCC/clang), void *jmp = &&label gives you the address of a label, which you can later jump to with goto *jmp. For an interpreter with pointer tagging, I want to align those labels to the next 8-byte boundary, so that all label values end in 0b…000. Usually, you can align the instruction position with assembly, i.e. __asm__ volatile(".p2align 8"); will pad the next instruction to the next 2^8-byte boundary with e.g. NOPs (for 8-byte alignment the argument would be 3). However, because the interpreter won't have code between jump blocks, the assembly gets optimized out (or something similar: even with volatile it doesn't appear in the output).
void test() {
    void *next = &&LABEL_A;
    goto *next;
    /* won't get reached: optimized away? */
    __asm__ volatile(".p2align 8");
LABEL_A:
    goto *next;
}
On godbolt it shows the compiler doesn't align LABEL_A. However, when we change the code to
void test() {
    unsigned long foo = (unsigned long)&&LABEL;
    void *next = &&LABEL_A;
    goto *next;
LABEL:
    __asm__ volatile(".p2align 8");
LABEL_A:
    goto *next;
}
The output will show that LABEL_A got correctly padded, presumably because LABEL and the following statements couldn't get optimized away. However, this seems extremely fragile and doesn't work with -O3. Is there a better way?
I tried several __attribute__((aligned(8), used)) attributes and volatile values/accesses between the blocks, though only the LABEL reference prevented the code elimination.
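To make the goal concrete, here is a minimal sketch of the kind of tagging that 8-byte alignment allows. It is not taken from the interpreter in question: the names TAG_BITS, tag_ptr, untag_ptr and tag_of are hypothetical, and it assumes exactly three tag bits stored in the low bits of an address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8-byte alignment leaves the low three address bits free. */
#define TAG_BITS 3u
#define TAG_MASK ((uintptr_t)((1u << TAG_BITS) - 1u))

/* Pack a small tag into the low bits of an 8-byte-aligned address. */
static uintptr_t tag_ptr(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0); /* address must be aligned */
    assert(tag <= TAG_MASK);                /* tag must fit in 3 bits  */
    return (uintptr_t)p | tag;
}

/* Recover the original address by clearing the tag bits. */
static void *untag_ptr(uintptr_t v)
{
    return (void *)(v & ~TAG_MASK);
}

/* Extract the tag stored in the low bits. */
static unsigned tag_of(uintptr_t v)
{
    return (unsigned)(v & TAG_MASK);
}
```

The same packing would apply to a label address obtained with &&label, but only if that address really has its low three bits clear, which is exactly what the alignment question is about.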
For GCC, you should use the -falign-labels=8 or the weaker -falign-jumps=8 options.
You could put them in a function attribute: __attribute__((optimize("-falign-labels=8"))).
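As a rough sketch of how the attribute could be applied to a computed-goto interpreter, consider the following. The three-opcode instruction set, the ALIGN_LABELS macro and the run function are made up for illustration. The attribute is GCC-specific (clang ignores it with a warning), the GCC manual describes optimize as intended for debugging rather than production code, and whether the labels really end up aligned can depend on the GCC version and optimization level. For that reason the sketch checks the low bits at run time instead of assuming them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC only. GCC adds the -f prefix to "align-labels=8" itself. */
#if defined(__GNUC__) && !defined(__clang__)
#define ALIGN_LABELS __attribute__((optimize("align-labels=8")))
#else
#define ALIGN_LABELS
#endif

/* Hypothetical opcodes: 0 = increment, 1 = double, 2 = halt. */
ALIGN_LABELS
long run(const unsigned char *code, int *labels_aligned)
{
    static void *const dispatch[] = { &&OP_INC, &&OP_DBL, &&OP_HALT };
    long acc = 0;
    unsigned low_bits = 0;

    assert(code != NULL && labels_aligned != NULL);

    /* Collect the low three bits of every label address. */
    for (size_t i = 0; i < sizeof dispatch / sizeof dispatch[0]; i++)
        low_bits |= (unsigned)((uintptr_t)dispatch[i] & 7u);
    *labels_aligned = (low_bits == 0);

    goto *dispatch[*code++];
OP_INC:
    acc += 1;
    goto *dispatch[*code++];
OP_DBL:
    acc *= 2;
    goto *dispatch[*code++];
OP_HALT:
    return acc;
}
```

Calling run with the program {0, 0, 1, 0, 2} increments twice, doubles, increments once more and halts. The labels_aligned flag reports whether tagging would be safe for this particular build, so a tagging interpreter could refuse to start when it is 0.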