Search code examples
chaskelltail-call-optimizationmultiple-languagesmutual-recursion

Compiling Tail-Call Optimization In Mutual Recursion Across C and Haskell


I'm experimenting with the foreign-function interface in Haskell. I wanted to implement a simple test to see if I could do mutual recursion. So, I created the following Haskell code:

module MutualRecursion where
import Data.Int

foreign import ccall countdownC::Int32->IO ()
foreign export ccall countdownHaskell::Int32->IO()

countdownHaskell::Int32->IO()
countdownHaskell n = print n >> if n > 0 then countdownC (pred n) else return ()

Note that the recursive case is a call to countdownC, so this should be tail-recursive.

In my C code, I have

#include <stdio.h>

#include "MutualRecursionHaskell_stub.h"

void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownHaskell(count-1);
}

int main(int argc, char* argv[])
{
    hs_init(&argc, &argv);

    countdownHaskell(10000);

    hs_exit();
    return 0;
}

Which is likewise tail recursive. So then I make a

MutualRecursion: MutualRecursionHaskell_stub
    ghc -O2 -no-hs-main MutualRecursionC.c MutualRecursionHaskell.o -o MutualRecursion
MutualRecursionHaskell_stub:
    ghc -O2 -c MutualRecursionHaskell.hs

and compile with make MutualRecursion.

And... upon running, it segfaults after printing 8991. Just as a test to make sure gcc itself can handle tco in mutual recursion, I did

void countdownC2(int);

void countdownC(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC2(count-1);
}

void countdownC2(int count)
{
    printf("%d\n", count);
    if(count > 0)
        return countdownC(count-1);
}

and that worked quite fine. It also works in the single-recursion case of just in C and just in Haskell.

So my question is, is there a way to indicate to GHC that the call to the external C function is tail recursive? I'm assuming that the stack frame does come from the call from Haskell to C and not the other way around, since the C code is very clearly a return of a function call.


Solution

  • I believe cross-language C-Haskell tail calls are very, very hard to achieve.

    I do not know the exact details, but the C runtime and the Haskell runtime are vastly different. The main factors for this difference, as far as I can see, are:

    • different paradigm: purely functional vs imperative
    • garbage collection vs manual memory management
    • lazy semantics vs strict one

    The kinds of optimizations which are likely to survive across language boundaries given such differences are next to zero. Perhaps, in theory, one could invent an ad hoc C runtime together with a Haskell runtime so that some optimizations are feasible, but GHC and GCC were not designed in this way.

    Just to show an example of the potential differences, assume we have the following Haskell code

    p :: Int -> Bool
    p x = x==42
    
    main = if p 42
           then putStrLn "A"     -- A
           else putStrLn "B"     -- B
    

    A possible implementation of the main could be the following:

    • push the address of A on the stack
    • push the address of B on the stack
    • push 42 on the stack
    • jump to p
    • A: print "A", jump to end
    • B: print "B", jump to end

    while p is implemented as follows:

    • p: pop x from the stack
    • pop b from stack
    • pop a from stack
    • test x against 42
    • if equal, jump to a
    • jump to b

    Note how p is invoked with two return addresses, one for each possible result. This is different from C, whose standard implementations use only one return address. When crossing boundaries the compiler must account for this difference and compensate.

    Above I also did not account for the case when the argument of p is a thunk, to keep it simple. The GHC allocator can also trigger garbage collection.

    Note that the above fictional implementation was actually used in the past by GHC (the so called "push/enter" STG machine). Even if now it is no longer in use, the "eval/apply" STG machine is only marginally closer to the C runtime. I'm not even sure about GHC using the regular C stack: I think it does not, using its own one.

    You can check the GHC developer wiki to see the gory details.