Search code examples
recursionerlangtail-call-optimizationtail-call

Erlang, Last Call Optimization, lambda functions, and how to prevent growing a stack


I'm writing some Erlang code, and I've run into a bizarre situation that I don't understand.

The code:

-module(recursive_test).
-export([a/2]).

a(_, []) -> ok;
a(Args, [H|T]) ->
    F = fun() -> a(Args, T) end,
    io:fwrite(
        "~nH: ~p~nStack Layers: ~p",
        [H, process_info(self(), stack_size)]
    ),
    b(Args, F).

b(Args, F) ->
    case Args of
        true -> ok;
        false -> F()
    end.

The output:

(Erlang/OTP 20)
12> c(recursive_test).
{ok,recursive_test}
13> recursive_test:a(false, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).
H: 1
Stack Layers: 28
H: 2
Stack Layers: 28
H: 3
Stack Layers: 28
H: 4
Stack Layers: 28
H: 5
Stack Layers: 28
H: 6
Stack Layers: 28
H: 7
Stack Layers: 28
H: 8
Stack Layers: 28
H: 9
Stack Layers: 28
H: 10
Stack Layers: 28
ok
14> recursive_test:a(false, [1, 2, 3, 4, 5, 6]).
H: 1
Stack Layers: 28
H: 2
Stack Layers: 28
H: 3
Stack Layers: 28
H: 4
Stack Layers: 28
H: 5
Stack Layers: 28
H: 6
Stack Layers: 28
ok

From what I understand from this article, Erlang uses a Last Call Optimization where, if the last thing a function does is to call another function, the BeamVM will instead jump the program counter to the start of the new function instead of pushing a new stack frame. Does this mean that in a pattern like the above, we're stomping around the heap instead of around the stack? Does this free the memory that was previously held in these functions (in the case of the above code, would we have one copy of function F allocated in memory at a time, or would we have many copies of the function F allocated in memory at a time)? Are there any negative consequences to using this pattern (other than the obvious debugging struggle)?


Solution

  • First, fun (lambda, closure or whatever you want to call it) due to immutable nature of Erlang could be and is implemented in a way you can imagine like tuple

    {fun, {Module, FuncRef, Arity, CodeVersion}, CapturedValues}
    

    So in your case, it will be something like

    {fun, {recursive_test, '-a/2-fun-0-', 2, 2248400...}, [false, [2,3,4|...]]}
    

    Note the arity is 2 because you have arity 0 of fun plus 2 captured values.

    Then it should be easier to understand what is going on in your code. Remember is not true tuple but some structure which behaves very similarly in regard to data consumption, heap term referencing, GC, transferring around the Erlang distribution protocol and so on.

    Which allows you to understand the second thing which is that calling F() inside b/2 is something like

    recursive_test:'-a/2-fun-0-'(false, [2,3,4|...])
    

    which can be nice tail call a.k.a jump. So your code is making fun "tuples" and then just jump around code. Each fun "tuple" is then not referenced anymore so could be GCed out anytime. I recommend you to try it with numbers instead of lists and try bigger and bigger numbers and watch memory consumption using process_info or observer. It would be a nice exercise.

    BTW you can use

    process_info(self(), stack_size)
    

    instead of slow and ugly

    roplists:get_value(stack_size, process_info(self()))