Search code examples
functionrecursionassemblymips

How a recursive function works in MIPS?


I'm a newbie in MIPS (as I started learning MIPS assembly for my college) and I've got a problem in understanding how a recursive function works in MIPS.

For example, I've got this program (in C) to write it in MIPS:

int fact (int n)
{
   if (n < 1) return 0;
   else return n * fact(n - 1);
}

Can someone help me, with this or another example of a recursive function and explain me how it works?


Solution

  • The first thing I'd like to share is that the complexity in translating this into MIPS comes from the presence of mere function calling, rather than because recursion is involved — that fact is recursive is IMHO a red herring.  To this end, I'll illustrate a non-recursive function that has every bit the complexity of the recursive function you've stated:

    int fact (int n)
    {
       if (n < 1) return 0;
       else return n * other(n - 1);    // I've changed the call to "fact" to function "other"
    }
    

    My alteration is no longer recursive!  However the MIPS code for this version will look identical to the MIPS code for your fact (with the exception, of course, that the jal fact which changes jal other).  This is meant to illustrate that the complexity in translating this is due to the call within the function, and has nothing to do with who is being called.  (Though YMMV with optimization techniques.)


    To understand function calling, you need to understand:

    • the program counter: how the program interacts with the program counter, especially, of course in the context of function calling..
    • parameter passing
    • register conventions, generally

    In C, we have explicit parameters.  These explicit parameter, of course, also appear in assembly/machine language — but there are also parameters passed in machine code that are not visible in C code.  Examples of these are the return address value, and the stack pointer.


    What is needed here is an analysis of the function (independent of recursion):

    The parameter n will be in $a0 on function entry.  The value of n is required after the function call (to other), because we cannot multiply until that function call returns the right hand operand of *.

    Therefore, n (the left hand operand to *) must survive the function call to other, and in $a0 it will not — since our own code will repurpose $a0 in order to call other(n-1), as n-1 must go into $a0 for that.

    Also, the (in C, implicit) parameter$ra holds the return address value needed to return to our caller.  The call to other will, similarly, repurpose the $ra register, wiping out its previous value.

    Therefore, this function (yours or mine) needs two values to survive the function call that is within its body (e.g. the call to other).

    The solution is simple: values we need (that are living in registers that are repurposed or wiped out by something we're doing, or the callee potentially does) need to be moved or copied elsewhere: somewhere that will survive the function call.

    Memory can be used for this, and, we can obtain some memory for these purposes using the stack.

    Based on this, we need to make a stack frame that has space for the two things we need (and would otherwise get wiped out) after calling other.  The entry $ra must be saved (and later reloaded) in order for us to use it to return; also, the initial n value needs to be saved so we can use it for the multiply.  (Stack frames are typically created in function prologue, and removed in function epilogue.)


    As is often the case in machine code (or even programming in general) there are also other ways of handling things, though the gist is the same.  (This is a good thing, and an optimizing compiler will generally seek the best way given the particular circumstances.)


    Presence or absence of recursion does not change the fundamental analysis we need to translate this into assembly/machine language.  Recursion dramatically increases the potential for stack overflow, but otherwise does not change this analysis.


    Addendum

    To be clear, recursion imposes the requirement to use a dynamically expandable call stack — though all modern computer systems provide such a stack for calling, so this requirement is easy to forget or gloss over on today's systems.

    For programs without recursion, a call stack is not a requirement — local variables can be allocated to function-private global variables (including the return address), and this was done on certain older systems like the PDP-8, which did not offer specific hardware support for a call stack.


    Systems that use stack memory for passing parameters and/or are register poor may not require the analysis described in this answer, since variables are already being stored in memory that survives nested function calls.

    It is the partitioning of registers on modern register-rich machines that creates the requirement for the above analysis.  These register-rich machines pass parameters and return values (mostly) in CPU registers, which is efficient but imposes the need to sometimes make copies as registers are repurposed from one function to another.