Search code examples
haskelldata-structuresfunctional-programmingunfold

What is the correct definition of `unfold` for an untagged tree?


I've been thinking in how to implement the equivalent of unfold for the following type:

data Tree a = Node (Tree a) (Tree a) | Leaf a | Nil

It was not immediately obvious since the standard unfold for lists returns a value and the next seed. For this datatype, it doesn't make sense, since there is no "value" until you reach a leaf node. This way, it only really makes sense to return new seeds or stop with a value. I'm using this definition:

data Drive s a = Stop | Unit a | Branch s s deriving Show

unfold :: (t -> Drive t a) -> t -> Tree a
unfold fn x = case fn x of
    Branch a b -> Node (unfold fn a) (unfold fn b)
    Unit a     -> Leaf a
    Stop       -> Nil

main = print $ unfold go 5 where
    go 0 = Stop
    go 1 = Unit 1
    go n = Branch (n - 1) (n - 2)

While this seems to work, I'm not sure this is how it is supposed to be. So, that is the question: what is the correct way to do it?


Solution

  • If you think of a datatype as the fixpoint of a functor then you can see that your definition is the sensible generalisation of the list case.

    module Unfold where
    

    Here we start by definition the fixpoint of a functor f: it's a layer of f followed by some more fixpoint:

    newtype Fix f = InFix { outFix :: f (Fix f) }
    

    To make things slightly clearer, here are the definitions of the functors corresponding to lists and trees. They have basically the same shape as the datatypes except that we have replace the recursive calls by an extra parameter. In other words, they describe what one layer of list / tree looks like and are generic over the possible substructures r.

    data ListF a r = LNil | LCons a r
    data TreeF a r = TNil | TLeaf a | TBranch r r
    

    Lists and trees are then respectively the fixpoints of ListF and TreeF:

    type List a = Fix (ListF a)
    type Tree a = Fix (TreeF a)
    

    Anyways, hopping you now have a better intuition about this fixpoint business, we can see that there is a generic way of defining an unfold function for these.

    Given an original seed as well as a function taking a seed and building one layer of f where the recursive structure are new seeds, we can build a whole structure:

    unfoldFix :: Functor f => (s -> f s) -> s -> Fix f
    unfoldFix node = go
      where go = InFix . fmap go . node
    

    This definition specialises to the usual unfold on list or your definition for trees. In other words: your definition was indeed the right one.