arm, compiler-optimization, 32-bit

ARM Cortex A7: avoid memory veneers?


On ARMv7, which is Thumb capable, is it right that we can avoid all the veneers by using the BX instruction?

Since this instruction takes a 32-bit register, are we good?

If yes, when I see veneers in the generated code, I should specialize the output for my machine, right?

Thanks


Solution

  • Yes, since BX takes a 32-bit register, there's no need for veneers because you can cover the whole address space.

    Of course you'd need to load a 32-bit value into the register first, which usually means reading it from a constant pool. So if you are looking to squeeze every cycle out of it and your program is not too large, you're better off with relative branches. As @Notlikethat notes, if you don't already have the address in a register, there's no point in using BX when you can just LDR PC, ... (unless you need to support ARMv4T interworking).

    Relative, unconditional, 32-bit Thumb branches encode a 24-bit immediate counted in halfwords, so you can reach +/- 16MB (other branch encodings have different ranges). If you're doing ELF, be really careful with 16-bit relative Thumb branches. A 32-bit branch will generate a 24-bit relocation, and the linker will insert a veneer if the target can't be addressed with 24 bits. A 16-bit branch generates an 11-bit relocation (about +/- 2KB), and ELF for ARM specifies that the linker is not required to generate veneers for those, so you'd risk an out-of-range branch error at link time.
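
To make the range arithmetic above concrete, here is a minimal sketch in Python (not part of the original answer). It assumes Thumb state, where the PC reads as the branch address plus 4, and a signed immediate counted in halfwords. The helper names `branch_offset`, `fits` and `classify` are hypothetical and only serve the illustration; a real linker decides this from the relocation type.

```python
# Illustrative only: helper names are made up for this sketch.
# Assumption: Thumb state, so PC reads as the branch address + 4.
THUMB_PC_OFFSET = 4


def branch_offset(branch_addr: int, target_addr: int) -> int:
    """Byte offset that the branch immediate would have to encode."""
    return target_addr - (branch_addr + THUMB_PC_OFFSET)


def fits(offset: int, imm_bits: int) -> bool:
    """True if offset fits a signed imm_bits-wide immediate counted in halfwords."""
    limit = 1 << imm_bits  # 24 bits -> 16 MB, 11 bits -> 2 KB
    # Targets are halfword aligned, so the byte offset must be even.
    return offset % 2 == 0 and -limit <= offset < limit


def classify(branch_addr: int, target_addr: int) -> str:
    """Pick the smallest unconditional Thumb branch that reaches the target."""
    offset = branch_offset(branch_addr, target_addr)
    if fits(offset, 11):
        return "16-bit B reaches it"
    if fits(offset, 24):
        return "32-bit B.W reaches it"
    return "out of range: veneer or register branch needed"


if __name__ == "__main__":
    base = 0x8000
    for distance in (0x100, 0x100000, 0x2000000):
        print(hex(distance), "->", classify(base, base + distance))
```

The last case is the one where a linker would insert a veneer for a 32-bit branch. For a 16-bit branch that falls outside the 11-bit range, no such rescue is guaranteed.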