Tags: android, linux, assembly, arm, stack-pointer

ARM assembly. Is it safe to use r13 (stack pointer) as a general purpose register?


I'm writing an extremely optimized leaf function, and to make it run faster I want to use R13 as a general purpose register. I preserve R13 by moving it to one of the VFP registers before using it, and I restore it by moving it back before returning from the function. It looks like this:

/* Start of the function */
push { r4 - r12, r14 }
vmov s0, r13
/* Body of the function. Here I use R13
 * as a general purpose register */
vmov r13, s0
pop { r4 - r12, r14 }
bx lr

And it works. But I have read that some operating systems assume that R13 is always used as a stack pointer, and using it as a general purpose register can cause crashes. I should also say that this function is intended to run only on Android (Linux). Thanks!


Solution

  • Obviously you should only consider this if you're already using all the other GP registers, including lr, and can't shift some of your work to NEON registers, e.g. using packed-integer operations even if you only care about the low 32 bits.

    (Using SIMD regs for more scalar integer is usually only useful if there's an isolated set of values that don't interact with the other values in your algorithm, and you don't need to branch on them or use them as pointers. Transfer between int and SIMD is slow on some ARM CPUs.)

    This is very non-standard, and is only even possibly safe in user-space, not in the kernel.


    If you have any signal handlers installed, your stack pointer must be valid when one of those signals arrives. (And that's asynchronous.)

    There's no other async usage of the user-space stack pointer in Linux beyond signal handlers. (Except if you're debugging with GDB and use print foo(123) where foo is a function in the target process.)

    As mentioned in comments on "Can I use rsp as a general purpose register" (the x86-64 equivalent of this question), there's a workaround even for signals:

    Use sigaltstack to set up an alternative stack, and specify SA_ONSTACK in the flags for sigaction when installing a handler.

    As @Timothy points out, if your scratch value of SP could be an integer that happens to "point" into the alt stack, the signal dispatch mechanism will assume this is a nested signal and won't modify SP (because in an actual nested-signal case, that would overwrite the first signal handler's still-in-use stack). So you could be one push away from SP going into an unmapped page, unless you allocate twice as much as you need and only pass the top half to sigaltstack. (Maybe just 2k or 4k for simple signal handlers that return after not doing much.)

    This should be safe even with nested signals: only the outer-most signal handler can start near the bottom of the alt stack, and use some of the allocated space beyond the actual altstack. Another signal will use space below that, if SP is still within the altstack. Or it will use the top of the altstack if SP has gotten outside the altstack.

    Or you can avoid the need for this over-allocation by using SP to hold a pointer to something else that's definitely not the alt stack, if any of your GP registers need to be a pointer. Having it be a valid pointer opens you up to corruption instead of faults if a debugger uses the current SP for something, or if you get the altstack mechanism wrong. But that's just a difference in failure mode: either is catastrophic.


    Hardware interrupts save state on the kernel stack, not the user-space stack. If they used the user stack:

    1. user-space could crash the OS by having an invalid SP.
    2. user-space could gain kernel privileges by having another user-space thread modify the kernel's stack data (including return addresses).

    (All user-space threads of a process share the same page table, and can read/write each other's stack mappings.)

    Linux/Android is very different from a lightweight RTOS without virtual memory or strict enforcement of privilege separation.