
Choose stack pointer address in x86 real mode (alignment)


I understand that the stack pointer should be aligned to a 2-byte boundary. In other words, one should not set SP to a value that ends in 0xF (or any other odd digit).

What happens then if I use 0xFFFF as SP? Is all 64kB usable, or one byte less?

If I want a stack size of 1024 bytes, should I set SP to 0x3FF or 0x400? In other words, is the byte that SS:SP initially points to going to be used?

They state here that one should also not use an SP address ending in 0xE, "wasting the bytes at 0x..E and 0x..F". How come?


Solution

  • The x86 stack is full descending.
    Full means that the stack pointer points to the last item pushed. This contrasts with an empty stack (descending or ascending), where the stack pointer points to the next free location.

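    Basically, this boils down to the semantics of push ax being roughly the following (pseudocode: [sp] is not an encodable 16-bit addressing mode, and the store goes through ss):

    sub sp, 02h
    mov WORD [ss:sp], ax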
    

    When you set the stack pointer sp to the address X, X is considered the location of the last item pushed, hence it will not be used.
    If you set sp to 0xe, a push will move sp to 0xe - 2 = 0xc and write its operand there. The memory at 0xe and above is not touched.

    Using an odd address for sp impacts performance negatively, because a misaligned memory access can have up to twice the latency of an aligned one.
    For quantities smaller than the DRAM bus width (8 bytes at the time of writing), this penalty is somewhat reduced.
    Considering how often the stack is used, it is worth keeping it aligned.

    Starting with an odd address for sp will lead to trouble when the stack pointer reaches 1. A push will set sp to 0xffff, but writing a word there will trigger a #SS because the high byte lies outside the ss limit.
    Raising an exception with a messed-up stack will, in turn, raise another #SS, which the CPU will dispatch as a #DF.
    But the stack is still messed up, so a third exception is generated (a triple fault) and the CPU shuts down, which on a PC normally results in a reset.
    So there is no benefit in having a misaligned stack pointer.

    If you want a stack of size S, set sp to S mod 2^16, provided that 2 <= S <= 64 KiB.
    You can check that this is right by working through an example with a small value of S (say 4).
    You can also check that setting sp to 0 gives you a 64 KiB stack, which is the largest size naturally available in real mode.
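
The checks suggested above can be worked through with a small model. This is only a sketch, written in Python rather than assembly so it can be run anywhere. It assumes the stack occupies the bottom of the segment (offsets 0 to S-1 of ss), and the names initial_sp, push_word and touched_offsets are made up for illustration.

```python
MASK16 = 0xFFFF  # sp is a 16-bit register, so its arithmetic wraps modulo 2**16


def initial_sp(size):
    """Initial sp for a stack of `size` bytes at the bottom of ss (size mod 2**16)."""
    if not 2 <= size <= 0x10000:
        raise ValueError("size must be between 2 and 65536 bytes")
    return size & MASK16


def push_word(sp, memory, value):
    """Model of a 16-bit push: decrement sp by 2, then store the word at ss:sp."""
    sp = (sp - 2) & MASK16
    if sp == MASK16:
        # The high byte of the word would lie past offset 0xFFFF (the #SS case).
        raise OverflowError("word store at 0xFFFF crosses the segment limit")
    memory[sp] = value & 0xFF             # low byte (little-endian)
    memory[sp + 1] = (value >> 8) & 0xFF  # high byte
    return sp


def touched_offsets(size, pushes):
    """Sorted offsets written by `pushes` pushes on a stack of `size` bytes."""
    memory = {}
    sp = initial_sp(size)
    for i in range(pushes):
        sp = push_word(sp, memory, i)
    return sorted(memory)


print(touched_offsets(4, 2))     # S = 4: which offsets do two pushes write?
print(hex(initial_sp(1024)))     # initial sp for a 1024-byte stack
print(hex(initial_sp(0x10000)))  # initial sp for a 64 KiB stack
```

In this model a 1024-byte stack starts with sp = 0x400, not 0x3FF: the byte at the initial sp is never written, and the first push stores at 0x3FE and 0x3FF. Starting from sp = 0, the first push wraps to 0xFFFE, so all 64 KiB are usable.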