Search code examples
haskellrecursiontreefold

What's the difference regarding evaluation order between `foldTree` and recursive versions?


 -- Top-down
treeDown' :: Tree a -> [a]
treeDown' (Node x xs) = x : concat (map treeDown' xs)
-- Bottom-up
treeUp' :: Tree a -> [a]
treeUp' = foldTree f
  where
    f x ns = x : concat ns

For me, both versions are equivalent in terms of:

  • output value
  • evaluation order (bottom-up)

Someone told me that in the foldTree version,

foldTree :: (a -> [b] -> b) -> Tree a -> b
foldTree f = go where
    go (Node x ts) = f x (map go ts)

f gets saturated after completed recursive descent over and return from the children and their subForests. And that's why it performing a "bottom-up" evaluation order and the other version is "top-down".

That's makes sense. However, I see the same is happening in treeDown' version. x can't be added to the list until map treeDown' xs evaluation is completed. Of course, due to lazyness, the actual evaluation order could change.

So, I am wondering, based solely on those definitions, aren't treeDown' and treeUp' exactly equivalent ?


Solution

  • I don't see why it matters when f is saturated. These functions are the same algorithm, and they process the tree in the same order. Because of laziness, that order is top down.

    To observe evaluation order in Haskell is tricky, again because of laziness. But GHCI gives us the tools to see what's going on, to see whether treeUp' and treeDown' behave identically.

    First, we define a way to build a small example tree, so that we can later check how much of that tree has been examined:

    mkTree :: Int -> Tree Int
    mkTree 0 = Node 0 []
    mkTree n = Node n [mkTree (n - 1)]
    

    Next we need a tool that computes a function's traversal of a fresh tree, prints the first item to force it, and then returns the whole traversal and the input tree:

    first :: (Tree Int -> [Int]) -> IO (Tree Int, [Int])
    first f = let t = mkTree 3
                  xs = f t 
              in print (head xs) *> pure (t, xs)
    

    Then call it on each of the two functions we'd like to test, saving the results:

    Prelude> down <- first treeDown'
    3
    Prelude> up <- first treeUp'
    3
    

    Observe that, as expected, each prints just 3, the root of the example tree. Does either of them do extra recursive work before putting the root of the tree at the head of the traversal? We can find out with GHCI's :sprint command, which shows you exactly how much of a value has been forced:

    Prelude> :sprint up
    up = (<Node> 3 [_],3 : _)
    Prelude> :sprint down
    down = (<Node> 3 [_],3 : _)
    

    And indeed, as I claimed, the two functions behave identically. Each only looks at the root of the input tree to compute the first item of the result list, leaving the rest of the tree unexplored until someone asks for another item.