Search code examples
pythonalgorithmdata-structurescatalan

Enumerate all full (labeled) binary tree


I'm searching a practical algorithm for enumerating all full labeled binary tree.

A full binary tree is a tree where all internal nodes has a degree 3, the leaves has degree 1 and the root has a degree 2.

A labeled tree is a tree where all leaves has a unique label.

Example:

    *
    |\
    | \
    *  *
   /|  |\
  / |  | \
 T  C  D  F

Solution

  • From comments, it is clear that the question is to enumerate rooted unordered labelled full binary trees. As explained in this paper, the number of such trees with n labels is (2n-3)!! where !! is the double factorial function.

    The following python program is based on the recursive proof in the referenced paper; I think the code is straight-forward enough that it will pass as an explanation of the algorithm:

    # A very simple representation for Nodes. Leaves are anything which is not a Node.
    class Node(object):
      def __init__(self, left, right):
        self.left = left
        self.right = right
    
      def __repr__(self):
        return '(%s %s)' % (self.left, self.right)
    
    # Given a tree and a label, yields every possible augmentation of the tree by
    # adding a new node with the label as a child "above" some existing Node or Leaf.
    def add_leaf(tree, label):
      yield Node(label, tree)
      if isinstance(tree, Node):
        for left in add_leaf(tree.left, label):
          yield Node(left, tree.right)
        for right in add_leaf(tree.right, label):
          yield Node(tree.left, right)
    
    # Given a list of labels, yield each rooted, unordered full binary tree with
    # the specified labels.
    def enum_unordered(labels):
      if len(labels) == 1:
        yield labels[0]
      else:
        for tree in enum_unordered(labels[1:]):
          for new_tree in add_leaf(tree, labels[0]):
            yield new_tree
    

    For n == 4, there are (2*4 - 3)!! == 5!! == 1 * 3 * 5 == 15 trees:

    >>> for tree in enum_unordered(("a","b","c","d")): print tree
    ... 
    (a (b (c d)))
    ((a b) (c d))
    (b (a (c d)))
    (b ((a c) d))
    (b (c (a d)))
    (a ((b c) d))
    ((a (b c)) d)
    (((a b) c) d)
    ((b (a c)) d)
    ((b c) (a d))
    (a (c (b d)))
    ((a c) (b d))
    (c (a (b d)))
    (c ((a b) d))
    (c (b (a d)))
    

    Another possible interpretation of the question was that it sought an enumeration of rooted ordered full binary trees with a specified list of labels. The number of such trees with n leaves is given by Cn-1, from the Catalan number sequence.

    def enum_ordered(labels):
      if len(labels) == 1:
        yield labels[0]
      else:
        for i in range(1, len(labels)):
          for left in enum_ordered(labels[:i]):
            for right in enum_ordered(labels[i:]):
              yield Node(left, right)
    

    For 5 labels, we have C5-1 == 14:

    >>> for tree in enum_ordered(("a","b","c","d", "e")): print tree
    ... 
    (a (b (c (d e))))
    (a (b ((c d) e)))
    (a ((b c) (d e)))
    (a ((b (c d)) e))
    (a (((b c) d) e))
    ((a b) (c (d e)))
    ((a b) ((c d) e))
    ((a (b c)) (d e))
    (((a b) c) (d e))
    ((a (b (c d))) e)
    ((a ((b c) d)) e)
    (((a b) (c d)) e)
    (((a (b c)) d) e)
    ((((a b) c) d) e)