Search code examples
recursiontypestreesml

Modifying tuples after counting a tree in SML


I have a question I'm working on, where I have to traverse a tree once and count how many children each node has.

There are two parts to this, the datatype, and the function itself.


Datatype

The datatype requires that the internal nodes store a value of any type and have anywhere from 1-3 children. The leaves themselves store either a real number or a string list.

datatype leaf = Slist of string list | Real of real;
datatype tree = Empty
              | Leaf of leaf
              | Node of leaf * 'a tree
              | Node of leaf * 'a tree * 'a tree
              | Node of leaf * 'a tree * 'a tree * 'a tree

Function

Next, I have to define a function that takes a tree, and returns a tuple (n1, n2, n3), where n1 is the number of nodes having one child, n2 number of nodes with two children, n3 number of nodes with three children.

fun myChildren (t:'a tree) = childHelper (t, (0,0,0));


fun childHelper (t: 'a tree, (n1, n2, n3):int*int*int) =
  case t of
    Empty => 0
    | Node (x) => childrenHelper(t, (n1 + 1, n2, n3))
    | Node (x, y) => childrenHelper(t, (n1 + 1, n2 + 1, n3))
    | Node (x, y, z) => childrenHelper(t, (n1 + 1, n2 + 1, n3 + 1));

I'm just starting to use datatypes and cases, so it's confusing for me. I was wondering, are my datatypes the best way to represent the tree? And how do I make my function recursively go through the tree? As it stands, it just counts the first node of the tree instead of all of it. I'm also leaving off the case that it's a leaf, since at that point I don't need to do anything.

I know that in a list, you can do hd::tl, is there a way I can do that on the tree so that, having gone through the node, I call the function on each node?

For example, for Node (x, y, z), it should call childrenHelper on each node, but at the same time, maintain the number on the tuples. Any ideas on how to go forward with this?


Solution

  • First off, you cannot have a datatype with multiple constructors that are named the same. One will inevitably shadow over the other. Were you to run this code, you'd get a warning similar to:

    ! datatype tree = Empty
    !               | Leaf of leaf
    !               | Node of leaf * 'a tree
    !               | Node of leaf * 'a tree * 'a tree
    !               | Node of leaf * 'a tree * 'a tree * 'a tree
    ! Unguarded type variables at the top-level
    

    This is because the definition uses a type parameter 'a that isn't declared as part of the tree type. Changing this into datatype 'a tree = ..., you instead get this error:

    !                  | Node of leaf * 'a tree * 'a tree
    !                    ^^^^
    ! Illegal constructor specification: the constructor cannot be specified twice for the same datatype
    

    What you can do instead is have three different constructors, e.g.

    datatype 'a tree = Empty
                     | Node0 of leaf
                     | Node1 of leaf * 'a tree
                     | Node2 of leaf * 'a tree * 'a tree
                     | Node3 of leaf * 'a tree * 'a tree * 'a tree
    

    Are my datatypes the best way to represent the tree?

    Yes, your datatype is very fine.

    How do I make my function recursively go through the tree?

    You can go through a tree in different ways. See tree traversal and folding a tree.

    in a list, you can do hd::tl, is there a way I can do that on the tree so that, having gone through the node, I call the function on each node?

    You could create a paramorphic function alike a fold, but where the parameterised function takes the full node and not just the element in the node as argument:

    fun para f e t =
       let val e1 = f (e, t)
       in  case t of
                Empty                 => e1
              | Node0 x                => e1
              | Node1 (x, t1)         => para f e1 t1
              | Node2 (x, t1, t2)     => para f (para f e1 t1) t2
              | Node3 (x, t1, t2, t3) => para f (para f (para f e1 t1) t2) t3
       end
    

    Counting the number of nodes with 1, 2 and 3 subtrees is a specialization of this function:

    fun nodecount t =
        let fun helper ((one, two, three), t) =
                case t of
                     Empty   => e1
                   | Node0 _  => (one, two, three)
                   | Node1 _ => (one+1, two, three)
                   | Node2 _ => (one, two+1, three)
                   | Node3 _ => (one, two, three+1)
        in para helper (0,0,0) t end
    

    Edit: Yes, the datatype above is in fact redundant, since a tree with exactly one leaf x could be written as:

    val one_leaf = Node0 (x)
    val one_leaf = Node1 (x, Empty)
    val one_leaf = Node2 (x, Empty, Empty)
    val one_leaf = Node3 (x, Empty, Empty, Empty)
    

    If you remove Empty, this redundancy goes away, but you can no longer represent empty trees. A simple way to overcome this is by using two types:

    datatype 'a tree = Empty | NonEmpty of 'a tree_aux
    and 'a tree_aux = Node0 of leaf
                    | Node1 of leaf * 'a tree_aux
                    | Node2 of leaf * 'a tree_aux * 'a tree_aux
                    | Node3 of leaf * 'a tree_aux * 'a tree_aux * 'a tree_aux 
    

    Or if you prefer fewer constructors and a type composed of pre-existing ones:

    datatype leaf = Slist of string list | Real of real;
    datatype 'a tree = Empty | NonEmpty of 'a tree_aux
    and 'a tree_aux = Node of leaf * ('a tree_aux * ('a tree_aux * 'a tree_aux option) option) option
    

    but this is slightly bothersome.