Usage of __attribute((interrupt)) of arm-none-eabi-gcc for exception handlers

Context: bare-metal development on an STM32H753 (Cortex-M7) with arm-none-eabi-gcc.

There is a function attribute, __attribute((interrupt)), dedicated to IRQ handlers.

After some testing, I confirmed that the only difference is that the stack is aligned to 8 bytes (i.e. the 3 least significant bits of the stack pointer address are cleared before the function pushes anything onto the stack).

What I don't understand is:

Why is the stack alignment different in thread mode and in handler mode?
Why is this attribute not used in the STMicro HAL and other sample code I've found?

Solution

On Cortex-M the "interrupt" attribute makes no practical difference: as you observed, all it does is realign the stack on entry. Cortex-M is designed so that interrupt handlers are just regular C functions, and they don't require any special function prologue/epilogue the way handlers on some other architectures do. Therefore you don't need this attribute at all, and that is why the HAL doesn't use it.
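To make the comparison concrete, here is a minimal sketch of the same handler written both ways. SysTick_Handler is the usual CMSIS handler name; the IRQ_REALIGN macro, the SysTick_Handler_Realigned name and the tick counter are made up for illustration, and the macro expands to nothing on non-ARM targets so the snippet also builds on a host machine.

```c
#include <stdint.h>

/* Apply the attribute only when building for ARM with GCC; elsewhere the
   macro expands to nothing so this sketch still compiles on a host PC. */
#if defined(__GNUC__) && defined(__arm__)
#define IRQ_REALIGN __attribute__((interrupt))
#else
#define IRQ_REALIGN
#endif

/* Hypothetical counter shared between the handler and thread-mode code. */
static volatile uint32_t tick_count;

/* Variant 1: a plain C function, which is all a Cortex-M handler needs. */
void SysTick_Handler(void)
{
    tick_count++;
}

/* Variant 2 (hypothetical name): same body, with the attribute applied.
   On Cortex-M the only expected difference is the stack realignment
   in the prologue. */
IRQ_REALIGN void SysTick_Handler_Realigned(void)
{
    tick_count++;
}

/* Accessor for thread-mode code. */
uint32_t ticks_elapsed(void)
{
    return tick_count;
}
```

In a real project only one of the two would be placed in the vector table; comparing the disassembly of both variants (for example with arm-none-eabi-objdump -d) is an easy way to see what the attribute adds.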

The ARM documentation recommends keeping the stack 8-byte (2-word, 64-bit) aligned at all times, but the core does not enforce it: if you push or pop just one word at a time, it works perfectly well. So if you write a piece of assembly, it's considered good practice to push/pop an even number of registers at a time, but the processor won't fault if you don't, and to be honest I've never had a situation where it mattered in any way. As pure speculation, the recommendation could be related to the internal AHB bus being 64 bits wide, but I know too little about how things work at that level.
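For reference, "8-byte aligned" just means the three low bits of the stack pointer are zero. The two helpers below are a small sketch of that arithmetic; the function names are made up, and the addresses in the usage note are only example SRAM-style values.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the address is a multiple of 8, i.e. bits 2..0 are all zero. */
bool is_aligned8(uint32_t sp)
{
    return (sp & 7u) == 0u;
}

/* Round an address down to the previous multiple of 8 by clearing bits 2..0.
   The stack grows downwards, so rounding down never overlaps data that is
   already on the stack. */
uint32_t align_down8(uint32_t sp)
{
    return sp & ~UINT32_C(7);
}
```

For example, a stack pointer of 0x20001004 is only 4-byte aligned and rounds down to 0x20001000, while 0x20001008 is already aligned and is left unchanged.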

When you're in thread mode and an interrupt occurs, Cortex-M automatically stacks R0-R3, R12, LR, the return address (PC of the next instruction) and xPSR, without any instructions in the code to do so. This is exactly why you don't need an "interrupt" attribute, and why Cortex-M interrupt handlers are basic C functions: the automatically stacked registers are essentially the caller-saved registers of regular C code, except that stacking and unstacking happen in hardware. So by the time you enter the interrupt handler, all caller-saved registers are already saved on the stack, and if the thread was using the dedicated thread (process) stack pointer, the core switches to the main stack pointer for the handler.

If, at the moment of the interrupt, the thread (or the lower-priority interrupt being preempted) had its stack 4-byte aligned rather than 8-byte aligned, the automatic stacking mechanism pushes one extra dummy word onto the stack, and that word is discarded again when unstacking. Again, no user action is required.
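The automatic stacking and the dummy word can be modelled in a few lines of host-side C. This is only a sketch of the address arithmetic, assuming the basic 8-word frame (no floating-point state) and that stack alignment on exception entry is enabled (the CCR.STKALIGN behaviour described in the ARMv7-M manual); the type and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Basic exception frame as stacked by hardware, lowest address first
   (assumes no floating-point state is stacked). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} basic_frame_t;

/* Bit 9 of the stacked xPSR records that a padding word was inserted. */
#define XPSR_PAD_BIT (UINT32_C(1) << 9)

typedef struct {
    uint32_t frame_addr; /* address of the stacked r0 */
    bool     padded;     /* true if one dummy word was skipped */
} entry_t;

/* Model of exception entry: reserve 32 bytes for the frame and force the
   frame address to be 8-byte aligned. 'sp' is assumed to be at least
   4-byte aligned, as it always is on the real core. */
entry_t model_exception_entry(uint32_t sp)
{
    entry_t e;
    e.padded = (sp & 4u) != 0u;
    e.frame_addr = (sp - (uint32_t)sizeof(basic_frame_t)) & ~UINT32_C(4);
    return e;
}

/* Model of exception return: drop the frame, then restore the original
   4-byte alignment if the stacked xPSR says a padding word was used. */
uint32_t model_exception_return(uint32_t frame_addr, uint32_t stacked_xpsr)
{
    uint32_t sp = frame_addr + (uint32_t)sizeof(basic_frame_t);
    if (stacked_xpsr & XPSR_PAD_BIT) {
        sp |= 4u;
    }
    return sp;
}
```

With a pre-interrupt stack pointer of 0x20001004 the model places the frame at 0x20000FE0 and marks it as padded; with 0x20001008 the frame sits at 0x20000FE8 with no padding. In both cases the return path gives back the original stack pointer, which is why the handler itself never has to care.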