Search code examples
recursionf#stack-overflowtail-recursion

Avoid stackoverflow when recursively dismantling a string


I'm working on a solution for a problem for the advent of code 2018 (spoiler alert) where I need a function which takes a string (or a char list) and removes every pair of chars when they react. The exercise describes two chars, or "elements" in a "polymer", reacting when they are the same letter but only differ in case; so starting out with AbBc would leave you with Ac. Keep in mind that after a reaction two chars could end up next to each other, where they weren't before, and cause a new reaction.

I thought I could solve this by using a recursive function which only deals with the first two chars and recursively calls itself, but since the input string is quite large, this causes a stackoverflow exception:

let rec react polymer =
    match polymer with
    | [] -> []
    | [x] -> [x]
    | head::tail ->
        let left = head
        let right = List.head tail
        let rest =  List.tail tail
        // 'reacts' takes two chars and
        // returns 'true' when they react
        match reacts left right with
        // when reacts we go further with
        // the rest as these two chars are
        // obliterated
        | true -> react rest
        // no reaction means the left char
        // remains intact and the right one 
        // could react with the first char 
        // of the rest
        | false -> [left] @ react tail

Then, just trying to solve the exercise to have a right answer to unit test against, I tried to do it imperatively, but that got messy real quick and now I'm kinda stuck. I'm teaching myself f# so any pointers are welcome. Can anyone solve this in a functional manner?


Solution

  • You can avoid stack overflow by rewriting your function to use tail recursion, which just means that the recursive call should be the last operation to execute.

    When you do [left] @ react tail you first make a recursive call, and then append [left] to the result of that. That means it has to keep the current function context, called a stack frame, around while it executes the recursive call, and if that recurses as well the stack frames add up until you get a stack overflow. But if there's no more work to be done in the current function context, the stack frame can be released (or reused), hence no stack overflow.

    You can make it tail recursive by adding another function argument, conventionally called acc since it "accumulates" values. Instead of adding left to the return value of the recursive call we add it to the accumulator and pass that along. Then when we exhaust the input, we return the accumulator instead of the empty list.

    I've also taken the liberty of the append, [left] @ ..., as a cons, left::..., since the latter is much more efficient than the former. I've also moved left, right and rest to the pattern, since that's much neater and safer. You should generally avoid using List.head and List.tail since they fail on empty lists and are bugs just waiting to happen.

    let rec react acc polymer =
        match polymer with
        | [] -> acc
        | [x] -> x::acc
        | left::right::rest ->
            match reacts left right with
            | true -> react acc rest
            | false -> react (left::acc) (right::rest)
    

    You could also use a guard instead of nested matches (which should really have been an if anyway):

    let rec react acc polymer =
        match polymer with
        | [] ->
            acc
        | [x] ->
            x::acc
        | left::right::rest when reacts left right ->
            react acc rest
        | left::rest ->
            react (left::acc) rest