Search code examples
pythonclassif-statementbinary-treeself

Creating Binary Tree


Most of the questions I've searched for regarding binary trees shows the implementation of binary search trees, but not binary trees. The terms of a complete binary tree are:

  • Either an empty tree or it has 1 node with 2 children, where each child is another Binary Tree.
  • All levels are full (except for possibly the last level)
  • All leaves on the bottom-most level are as far left as possible.

I've come up with a concept but it doesn't seem to running through the recursion properly -- Does anyone know what I'm doing wrong?

class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

    def add(self, key): 
        if self.key: 
            if self.left is None: 
                self.left = Node(key) 
            else: 
                self.left.add(key)

                if self.right is None: 
                    self.right = Node(key) 
                else: 
                    self.right.add(key) 
        else: 
            self.key = key
        return (self.key)

Solution

  • The problem in your code is that you are adding the same value multiple times. You add the node, and then still recurse deeper, where you do the same.

    The deeper problem is that you don't really know where to insert the node before you have reached the bottom level of the tree, and have detected where that level is incomplete. Finding the correct insertion point may need a traversal through the whole tree... which is defeating the speed gain you would expect to get from using binary trees in the first place.

    I provide here three solutions, starting with the most efficient:

    1. Using a list as tree implementation

    For complete trees there is a special consideration to make: if you number the nodes by level, starting with 0 for the root, and within each level from left to right, you notice that the number of a node's parent is (k-1)/2 when its own number is k. In the other direction: if a node with number k has children, then its left child has number k*2+1, and the right child has a number that is one greater.

    Because the tree is complete, there will never be gaps in this numbering, and so you could store the nodes in a list, and use the indexes of that list for the node numbering. Adding a node to the tree now simply means you append it to that list. Instead of a Node object, you just have the tree list, and the index in that list is your node reference.

    Here is an implementation:

    class CompleteTree(list):
        def add(self, key):
            self.append(key)
            return len(self) - 1
    
        def left(self, i):
            return i * 2 + 1 if i * 2 + 1 < len(self) else -1
    
        def right(self, i):
            return i * 2 + 2 if i * 2 + 2 < len(self) else -1            
    
        @staticmethod
        def parent(i):
            return (i - 1) // 2
    
        def swapwithparent(self, i):
            if i > 0:
                p = self.parent(i)
                self[p], self[i] = self[i], self[p]
    
        def inorder(self, i=0):
            left = self.left(i)
            right = self.right(i)
            if left >= 0:
                yield from self.inorder(left)
            yield i
            if right >= 0:
                yield from self.inorder(right)
    
        @staticmethod
        def depth(i):
            return (i + 1).bit_length() - 1
    

    Here is a demo that creates your example tree, and then prints the keys visited in an in-order traversal, indented by their depth in the tree:

    tree = CompleteTree()
    tree.add(1)
    tree.add(2)
    tree.add(3)
    tree.add(4)
    tree.add(5)
    for node in tree.inorder():
        print("  " * tree.depth(node), tree[node])
    

    Of course, this means you have to reference nodes a bit different from when you would use a real Node class, but the efficiency gain pays off.

    2. Using an extra property

    If you know how many nodes there are in a (sub)tree, then from the bit representation of that number, you can know where exactly the next node should be added.

    For instance, in your example tree you have 5 nodes. Imagine you want to add a 6 to that tree. The root node would tell you that you currently have 5 and so you need to update it to 6. In binary that is 110. Ignoring the left-most 1-bit, the rest of the bits tell you whether to go left or right. In this case, you should go right (1) and then finally left (0), creating the node in that direction. You can do this iteratively or recursively.

    Here is an implementation with recursion:

    class Node():
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None
            self.count = 1
    
        def add(self, key):
            self.count += 1
            if self.left is None:
                self.left = Node(key)
            elif self.right is None:
                self.right = Node(key)
            # extract from the count the second-most significant bit:
            elif self.count & (1 << (self.count.bit_length() - 2)):
                self.right.add(key)
            else:
                self.left.add(key)
    
        def inorder(self):
            if self.left:
                yield from self.left.inorder()
            yield self
            if self.right:
                yield from self.right.inorder()
    
    tree = Node(1)
    tree.add(2)
    tree.add(3)
    tree.add(4)
    tree.add(5)
    for node in tree.inorder():
        print(node.key)
    

    3. Without extra property

    If no property can be added to Node objects, then a more extensive search is needed to find the right insertion point:

    class Node():
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None
    
        def newparent(self):
            # Finds the node that should serve as parent for a new node
            # It returns a tuple:
            #   if parent found: [-1, parent for new node]
            #   if not found: [height, left-most leaf]
            # In the latter case, the subtree is perfect, and its left-most  
            # leaf is the node to be used, unless self is a right child 
            # and its sibling has the insertion point.
            if self.right:
                right = self.right.newparent()
                if right[0] == -1: # found inbalance
                    return right
                left = self.left.newparent()
                if left[0] == -1: # found inbalance
                    return left
                if left[0] != right[0]:
                    return [-1, right[1]] # found inbalance
                # temporary result in perfect subtree
                return [left[0]+1, left[1]]
            elif self.left:
                return [-1, self] # found inbalance
            # temporary result for leaf
            return [0, self]
    
        def add(self, key):
            _, parent = self.newparent()
            if not parent.left:
                parent.left = Node(key)
            else:
                parent.right = Node(key)
    
        def __repr__(self):
            s = ""
            if self.left:
                s += str(self.left).replace("\n", "\n  ")
            s += "\n" + str(self.key)
            if self.right:
                s += str(self.right).replace("\n", "\n  ")
            return s
    
    tree = Node(1)
    tree.add(2)
    tree.add(3)
    tree.add(4)
    tree.add(5)
    print(tree)
    

    This searches recursively the tree from right to left, to find the candidate parent of the node to be added.

    For large trees, this can be improved a bit, by doing a binary-search among paths from root to leaf, based on the length of those paths. But it will still not be as efficient as the first two solutions.