Tags: assembly, parameter-passing, mips, calling-convention

Why are arguments ordered on the stack so that the first argument is at the lowest address, the second at the second-lowest, and so on?


My computer architecture professor asked us to explain why the arguments are put on the stack in reverse order.

What I mean is this: say we want to call a plain function that takes 2 integers. We expand our stack by at least 24 bytes, which makes a home section for registers a0 - a3, leaves enough space to store the value of the ra register, and keeps sp aligned to an address that is a multiple of 8.

Why is it that register a0 is at sp + 0, register a1 at sp + 4 and so on?

The only thing I thought of is that it is pure convention, but there would be no reason to ask me for the reason behind the reverse order of arguments if it were just convention...


Solution

  • If the calling convention passes args in registers, you don't need to store them to the stack at all. In the standard MIPS convention, the caller reserves "home space" (sometimes called "shadow space") before a call, but it's the callee that chooses how to use it.

    What you're calling "reverse order" isn't really reversed when you look at them as an array of args. If there are more than 4 args, the caller will have stored the 5th and later on the stack before the call. The callee can dump the register args to the home space to produce a contiguous array with the first (left-most) arg at the lowest address, continuing on into args the caller passed on the stack directly above the home space.
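
    As a rough sketch of that layout (assuming every arg is a 4-byte word and the first four arrive in $a0-$a3; the names `arg_slot_offset` and `arg_arrives_in_register` are made up for illustration), the slot for arg number i sits at a fixed offset from the callee's incoming $sp, whether the callee or the caller is the one that stores it:

    ```c
    #include <stdbool.h>
    #include <stddef.h>

    /* Assumptions for this sketch: 32-bit words, four arg registers. */
    enum { WORD_BYTES = 4, REG_ARGS = 4 };

    /* Byte offset from the callee's incoming $sp to the slot for arg `index`
       (0 = left-most arg). Slots 0..3 are home space, later ones are stack args. */
    size_t arg_slot_offset(size_t index)
    {
        return index * WORD_BYTES;
    }

    /* True if the arg arrives in a register, so its home-space slot holds
       nothing useful until the callee chooses to store the register there. */
    bool arg_arrives_in_register(size_t index)
    {
        return index < REG_ARGS;
    }
    ```

    So the first arg maps to sp + 0, the second to sp + 4, and the 5th (index 4) to sp + 16, directly above the home space. Wider or differently aligned arg types are not covered by this sketch.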

    A compiler making a debug build will typically store its args to memory that way. But it can choose to do whatever it wants, especially in an optimized build, and so can hand-written asm, at least for non-variadic functions. Functions like printf that need to iterate over their args in order typically do just store the register args to the home space, not including the "fixed" args like the first arg being the format string.

    If you need to store your args anywhere, the "standard" home-space slots are a good choice.
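
    To make the "iterate their args in order" part concrete, here is a minimal variadic function in portable C. `sum_ints` is a hypothetical helper, not anything from a library. How `va_arg` finds each arg is up to the compiler and ABI; on a convention like the one described here, a plausible implementation is a pointer stepping through the contiguous array formed by the home space and the stack args above it.

    ```c
    #include <stdarg.h>

    /* Hypothetical helper: adds up the `count` int args that follow `count`. */
    int sum_ints(int count, ...)
    {
        va_list ap;
        int total = 0;

        va_start(ap, count);             /* start after the last fixed arg */
        for (int i = 0; i < count; i++)
            total += va_arg(ap, int);    /* fetch the next arg, left to right */
        va_end(ap);

        return total;
    }
    ```

    A call like `sum_ints(6, 1, 2, 3, 4, 5, 6)` passes 7 args in total, so with 4 arg registers some of them would have to travel on the stack, yet the callee walks them all with the same loop.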

    C calling conventions enumerate args from left to right. So the left-most arg goes in the first arg-passing register, and so on. (Fun fact: Pascal did the opposite. In Pascal conventions, the leftmost arg would be at the highest address. On a machine like x86 with push instructions, a calling convention with no register args could do this by pushing args in left to right order, so the last one would be at the lowest address. The stack grows downward on most machines, including x86 and the standard MIPS calling convention.)
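
    The effect of push order on addresses can be checked with a toy model. This is only a simulation in C of a downward-growing stack with no register args (the `sim_*` and `push_*` names are invented for this sketch), not real calling-convention code:

    ```c
    #include <stddef.h>

    enum { STACK_WORDS = 16 };   /* arbitrary size for the toy stack */

    struct sim_stack {
        int mem[STACK_WORDS];
        size_t sp;               /* index of the lowest used word */
    };

    /* Empty stack: sp starts one past the highest word. */
    void sim_init(struct sim_stack *s) { s->sp = STACK_WORDS; }

    /* Push = move sp down, then store (no overflow check in this sketch). */
    void sim_push(struct sim_stack *s, int value) { s->mem[--s->sp] = value; }

    /* C-style: push right to left, so the left-most arg ends up lowest. */
    void push_c_order(struct sim_stack *s, const int *args, size_t n)
    {
        for (size_t i = n; i > 0; i--)
            sim_push(s, args[i - 1]);
    }

    /* Pascal-style: push left to right, so the left-most arg ends up highest. */
    void push_pascal_order(struct sim_stack *s, const int *args, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            sim_push(s, args[i]);
    }
    ```

    With args {10, 20, 30}, the C-style order leaves 10 at `mem[sp]` (the lowest address), while the Pascal-style order leaves 30 there and 10 at `mem[sp + 2]`.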

    Left-most arg at a consistent place (like $a0) is necessary for variadic functions: ISO C requires printf("hello\n", 1, 2, 3, 4); to safely ignore args (see Footnote 1) not referenced by the format string. printf doesn't ever need to know there is a 5th arg (with value 4) that the caller put on the stack in the standard 4-register convention MIPS uses. But it does need to find the format string. (And snprintf or fprintf have more non-variadic args.)
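
    That guarantee is easy to try out in plain C. In this small sketch, `format_with_extras` is a made-up wrapper that always passes four extra ints to snprintf, whatever the format string asks for. The format is a parameter rather than a literal only so compilers don't warn about the unused args:

    ```c
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical wrapper: formats `fmt` into `buf`, always passing 1, 2, 3, 4
       as extra args. Conversions in `fmt` consume them left to right; any that
       are left over are evaluated by the caller but otherwise ignored. */
    int format_with_extras(char *buf, size_t size, const char *fmt)
    {
        return snprintf(buf, size, fmt, 1, 2, 3, 4);
    }
    ```

    Calling it with "hello\n" fills the buffer with just that text, and calling it with "%d-%d" uses only the first two extras. Either way the callee only needs to find the format string and then walk as many args as the format requests.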


    Fun fact: MIPS doesn't need "home space" to let functions make a contiguous array of args (useful for variadic functions). jal doesn't modify $sp, so the callee could make space right below the stack args to store the incoming register args. You wouldn't call this "home space" since it's not special in terms of debuggers looking for incoming arg variables in stack frames. And unlike caller-reserved home space, it wouldn't save instructions for tiny leaf functions (that failed to inline for whatever reason), although a red zone (safe area below $sp) would be equally good for that.

    This is different in Windows x64 for example, where the call instruction pushes a return address on the stack, so it is important that the caller reserve home space before the call so the callee can make a contiguous array of args, simplifying implementation of variadic functions compared to conventions that don't do that. (Like x86-64 System V.)


    The standard MIPS convention does have the caller reserve the home space, and leaves it to the caller to clean it up. We can see that from looking at compiler output:

    void bar();
    
    void foo(void)
    {         // reserves 32 bytes, home space + room to save/restore the return address
        bar();
        bar();      // last call: GCC can turn this one into a tail-call (j bar)
    }
    
    void use_home_space(volatile int x, volatile int y)
    {
        // volatile function args get spilled even with optimization enabled.
        ++x;
    }
    

    Compiled with GCC on Godbolt:

    # GCC11.2 -O2 -march=mips3 -fno-delayed-branch
    #   the default is mips1, which needs a nop in the load-delay slot
    foo:
            addiu   $sp,$sp,-32       # reserve 32 bytes, home space + return address
                                      # padded to keep SP 16-byte aligned?
            sw      $31,28($sp)       # save return address
            jal     bar
            nop
    
            lw      $31,28($sp)
            addiu   $sp,$sp,32
            j       bar                
            nop
    

    GCC could have saved its return address into the incoming home space and only allocated 16 bytes in this function, not 32. IDK if that's just a missed optimization, or if there are ABI reasons not to do that.
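
    The frame sizes in play here are just round-up arithmetic. The sketch below assumes 16 bytes of home space plus 4 bytes for the saved return address, as in the question; `align_up` and `frame_bytes` are illustrative names, and this says nothing certain about why GCC picked 32:

    ```c
    #include <stddef.h>

    /* Assumed sizes: 4 home-space slots of 4 bytes, one 4-byte saved $ra. */
    enum { HOME_SPACE_BYTES = 16, SAVED_RA_BYTES = 4 };

    /* Round `bytes` up to a multiple of `alignment` (must be a power of two). */
    size_t align_up(size_t bytes, size_t alignment)
    {
        return (bytes + alignment - 1) & ~(alignment - 1);
    }

    /* Frame size for home space + saved return address at a given alignment. */
    size_t frame_bytes(size_t alignment)
    {
        return align_up(HOME_SPACE_BYTES + SAVED_RA_BYTES, alignment);
    }
    ```

    With 8-byte alignment this gives the 24 bytes mentioned in the question. With 16-byte alignment it gives 32, which would be consistent with the guess in the asm comment above, but that remains a guess.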

    use_home_space:
            sw      $4,0($sp)     # store x ($a0) at the lowest address
            sw      $5,4($sp)     # store y ($a1) at the 2nd home-space slot
    
            lw      $3,0($sp)
            addiu   $3,$3,1       # ++x
            sw      $3,0($sp)
    
            jr      $31           # return
            nop
    

    Footnote 1: The requirement to safely ignore excess args also means callee-pops conventions can't be used for variadic functions.

    For example 32-bit x86 stdcall isn't usable; x86 C implementations that default to stdcall for most functions use cdecl for variadic functions.

    The only standard convention on MIPS I know of is the one GCC uses, which is caller-pops, i.e. the caller cleans up the home space and any stack args it allocated space for. The callee returns with SP having the same value as it did on entry.

    This makes a lot of sense for MIPS; x86 has stack instructions like push/pop, and many legacy x86 calling conventions don't pass any args in registers, so it's normal to push/push/call or something, and then have to undo those 2 pushes after each call returns. Or use a plain store (x86 mov) instead of push to set args for the next call. And x86 even has a special form of return instruction that pops a return address as usual, and then adds an immediate to the stack pointer to pop n more bytes of space.

    MIPS has none of those features that make callee-pops attractive or useful. It's more efficient to just sw args into stack memory than to also addiu $sp, $sp, -4 before each store to make it a "push". But on x86, push eax is a 1-byte instruction vs. mov [esp+4], eax which is multiple bytes. (And on modern x86 both are almost equally fast.) So on MIPS a function that makes multiple function calls just allocates enough stack space for the largest arg area it's going to need (including home space), and before each call it sets registers and maybe does some sw stores. $sp doesn't move until the function returns (except inside callees of course, or if we do an alloca...)
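
    The "largest arg area" rule can be written down as a small calculation. This sketch assumes 4-byte arg slots and a 4-slot home space for every call, and the function names are made up for illustration:

    ```c
    #include <stddef.h>

    /* Assumptions for this sketch: 4-byte slots, 4 home-space slots per call. */
    enum { WORD_BYTES = 4, HOME_SLOTS = 4 };

    /* Bytes of outgoing-arg area one call needs: never less than home space. */
    size_t call_arg_area(size_t arg_count)
    {
        size_t slots = arg_count > HOME_SLOTS ? arg_count : HOME_SLOTS;
        return slots * WORD_BYTES;
    }

    /* Bytes a function reserves once, up front, to cover every call it makes.
       A leaf function (call_count == 0) needs none. */
    size_t outgoing_area(const size_t *arg_counts, size_t call_count)
    {
        size_t largest = 0;
        for (size_t i = 0; i < call_count; i++) {
            size_t need = call_arg_area(arg_counts[i]);
            if (need > largest)
                largest = need;
        }
        return largest;
    }
    ```

    For a function that makes calls with 2, 6 and 3 args, this gives 24 bytes: the 6-arg call needs 16 bytes of home space plus 8 bytes of stack args, and the other two calls reuse the low 16 bytes of the same area.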

    Having a callee deallocate the home space would mean you'd have to allocate it again for the next call. (Windows x64, which also has home space, is similarly a caller-pops convention, for similar reasons. Even when functions have more than 4 args on Windows x64, it's normal to set the stack args with plain mov stores, not push.)

    And non-leaf MIPS functions also need to store/reload their return address. If they reserve stack space for that like GCC does (instead of using their own home space allocated by their caller), then a callee deallocating its own home space wouldn't even undo all of a caller's allocation, so the caller would still need to modify $sp again before returning.

    So a callee-pops convention on MIPS, especially in a convention with home space like the standard one, would be counter-productive: it would cost more instructions in function epilogues and at most call sites, and would not usually save anything.