I can't seem to wrap my head around recursion in Assembly Language. I understand how it works in higher level languages, but I don't understand how it is possible in assembly when the return value cannot be passed directly to the function.
I'm trying to make a recursive factorial function in AVR, but I don't understand how the stack passes the value when factorial requires n * (n-1), requiring both n and n-1 simultaneously
Using addition instead of multiplication
unsigned int accumulate(unsigned int n)
{
if(n) return(n+accumulate(n-1));
return(1);
}
and a different instruction set, perhaps easier to follow
00000000 <accumulate>:
0: e3500000 cmp r0, #0
4: 0a000005 beq 20 <accumulate+0x20>
8: e3a03000 mov r3, #0
c: e0833000 add r3, r3, r0
10: e2500001 subs r0, r0, #1
14: 1afffffc bne c <accumulate+0xc>
18: e2830001 add r0, r3, #1
1c: e12fff1e bx lr
20: e3a00001 mov r0, #1
24: e12fff1e bx lr
In this case the compiler didnt actually call the function, it detected what was going on and just made a loop.
Since there is nothing magic about recursion there is no difference in whether you call the same function or some other function.
unsigned int otherfun ( unsigned int );
unsigned int accumulate(unsigned int n)
{
if(n) return(n+otherfun(n-1));
return(1);
}
00000000 <accumulate>:
0: e92d4010 push {r4, lr}
4: e2504000 subs r4, r0, #0
8: 03a00001 moveq r0, #1
c: 0a000002 beq 1c <accumulate+0x1c>
10: e2440001 sub r0, r4, #1
14: ebfffffe bl 0 <otherfun>
18: e0800004 add r0, r0, r4
1c: e8bd4010 pop {r4, lr}
20: e12fff1e bx lr
so this shows how it works. Instead of using the stack to store the sum, the cheaper solution if you have the registers is to use a non-volatile register save that register to the stack then use that register during the funciton, depends on how many registers you have and how many local intermediate values you need to track. So r4 gets a copy of n coming in, then that is added (for factorial it is a multiply which depending on the instruction set and code can produce a lot more code that can confuse the understanding so I used addition instead) to the return value from the call to the next function (with recursion where the compiler didnt figure out what we were doing this would have been a call to ourselves, and we can write this asm and make it a call to ourselves to see how it works)
Then the function returns the sum.
If we assume that otherfun is really accumulate we enter this function with a 4 lets say
00000000 <accumulate>:
0: e92d4010 push {r4, lr}
4: e2504000 subs r4, r0, #0
8: 03a00001 moveq r0, #1
c: 0a000002 beq 1c <accumulate+0x1c>
10: e2440001 sub r0, r4, #1
14: ebxxxxxx bl accumulate
18: e0800004 add r0, r0, r4
1c: e8bd4010 pop {r4, lr}
20: e12fff1e bx lr
r4 and lr are saved on the stack (call this r4-4 and lr-4)
r4 = n (4)
r0 = n-1 (3)
call accumulate with n-1 (3)
r4 (4) and lr are saved on the stack (r4-3, lr-3) lr now points back into
r4 = n (3)
r0 = n-1 (2)
call accumulate with n-1 (2)
r4 (3) and lr are saved on the stack (r4-2, lr-2)
r4 = n (2)
r0 = n-1 (1)
call accumulate with n-1 (1)
r4 (2) and lr are saved on the stack (r4-1, lr-1)
r0 = n-1 (0)
call accumulate with n-1 (0)
now things change...
r0 = 1
return to lr-1 which is into accumulate after the call to accumulate
r4 gets 2 from the stack
r0 (1) = r0 (1) + r4 (2) = 3
return to lr-2 which is into accumulate r4 gets 3 from the stack
r0 (3) = r0 (3) + r4 (3) = 6
return to lr-3 which is into accumulate r4 gets 4 from the stack
r0 (6) = r0 (6) + r4 (4) = 10
return to lr-4 which is the function that called accumulate r4 is restored
to what it was before accumulate was first called, r4 is non-volatile you have to for this instruction set return r4 the way you found it (as well
as others, but we didnt modify those)
so the addition in this case multiplication in your desired case is
result = 1 + 2 + 3 + 4
How that happened is we basically pushed n on the stack then called the function with n-1. In this case we push 4, 3, 2, 1 then we start to unwind that and each return processes 1 then 2 then 3 then 4 as it returns taking those from the stack essentially.
the bottom line is you dont have to care about recursion to support recursion simply use an abi that supports recursion, which is not hard to do, then hand code the instructions in assembly as if you were the compiler
Maybe this makes it easier to see. n coming in is both a parameter coming in but also for the duration of the function it is a local variable, local variables go on the stack.
unsigned int accumulate(unsigned int n)
{
unsigned int m;
m = n;
if(n) return(m+accumulate(n-1));
return(1);
}
back to this
unsigned int accumulate(unsigned int n)
{
if(n) return(n+accumulate(n-1));
return(1);
}
so independent of the instruction set
accumulate:
if(n!=0) jump over
return_reg = 1
return
over:
push n on the stack
first parameter (stack or register) = n - 1
call accumulate
pop or load n from the stack
return_reg = return_reg + n
clean stack
return
And also deal with return addresses for the instruction set if required. The ABI may use the stack to pass parameters or registers.
If I didnt follow the arm abi I could implement
accumulate:
cmp r0,#0
bne over
mov r0,#1
bx lr
over:
push {lr}
push {r0}
sub r0,#1
bl accumulate
pop {r1}
add r0,r0,r1
pop {lr}
bx lr
for grins an instruction set that uses the stack for most things not registers
00000000 <_accumulate>:
0: 1166 mov r5, -(sp)
2: 1185 mov sp, r5
4: 10a6 mov r2, -(sp)
6: 1d42 0004 mov 4(r5), r2
a: 0206 bne 18 <_accumulate+0x18>
c: 15c0 0001 mov $1, r0
10: 1d42 fffc mov -4(r5), r2
14: 1585 mov (sp)+, r5
16: 0087 rts pc
18: 1080 mov r2, r0
1a: 0ac0 dec r0
1c: 1026 mov r0, -(sp)
1e: 09f7 ffde jsr pc, 0 <_accumulate>
22: 6080 add r2, r0
24: 65c6 0002 add $2, sp
28: 1d42 fffc mov -4(r5), r2
2c: 1585 mov (sp)+, r5
2e: 0087 rts pc
it does a stack frame thing
gets the n parameter from the stack
saves that n parameter to the stack
compares and branches if not zero
in the if zero case we set the return value to 1
clean up the stack and return
now in the if not zero case
make the first parameter n-1
call a function (ourself)
do the addition and return