Search code examples
pythonalgorithmtreeternary-tree

Majority tree evaluation


Consider a complete ternary tree of depth h, composed of a root attached to three complete ternary trees of depth h - 1. There are n = 3^h leaves and each leaf has a boolean value associated to it. Each internal node, including the root, equals the value of the majority of its children.

Here is an example of a tree of depth 2:

e

Given the leaves input vector [0, 0, 1, 0, 1, 0, 1, 1, 1], we would like to find the root of the tree. In order to find the root, we could evaluate all the leaves and internal nodes (i.e. 3^h operations). But we may be able to evaluate fewer nodes. In the example above, we can see that the value of the first internal node (most left) can be evaluated after examining its first two children. Similarly at depth = 1, the first two nodes are sufficient to find the root of the tree.

I have been thinking over this problem but I can't find a good way to tackle the problem.

import numpy as np
import random

class node:
def __init__(self):
    self.low = None
    self.mid = None
    self.high = None

def put(self, low, mid, high):
    self.low = low
    self.mid = mid
    self.high = high
    return self

class ternary_tree:

def __init__(self, leaves, count= 0):
    self.leaves = leaves
    self.root = node()
    self.value = None
    self.count = count

def get_root(self):
    self.root.put(self.leaves[0], self.leaves[1], self.leaves[2])
    self.value = majority(self.root)
    return self.value

def majority(node):
    global ops_count
    r1, r2, r3 = random.sample([node.low, node.mid, node.high], 3)
    if r1 > 0:
        ops_count += 1
        if r2 > 0:
            ops_count += 1
            return 1;
        elif r3 > 0:
            ops_count += 1
            return 1;
        else:
            return 0;
    else:
        ops_count += 1
        if r2 == 0:
            ops_count += 1
            return 0;
        elif r3 == 0:
            ops_count += 1
            return 0;
        else:
            return 1;

if __name__ == "__main__":
    h = 2 # depth of the tree
    my_leaves = [random.randint(0,1) for i in range(0, 3**h)] #input vector
    ops_count = 0 #Counting the number of steps
    nb_trees = len(my_leaves) // 3
    my_trees = [ternary_tree(leaves=my_leaves[i:i+3]) for i in range(0, nb_trees)]
    new_leaves = []
    t1, t2, t3 = random.sample(my_trees, 3)
    new_leaves.append(t1.get_root())
    new_leaves.append(t2.get_root())
    if new_leaves[0] == new_leaves[1]:
        new_leaves.append(new_leaves[0])
    else:
        new_leaves.append(t3.get_root())
    ternary_tree(leaves=new_leaves).get_root()

I think the code does the job, but it is not optimal in the sense that it still checks all the internal node and doesn't skip the redundant nodes. I think the right approach is to have a recursive algorithm like a binary search tree, but I cannot make the link between BST and the majority tree evaluation.

I would appreciate if you could give me any indication on how to solve this problem. Thanks!

The illustration comes from here: http://www.math.ucsd.edu/~gptesler/188/MajTree/majtree.html.


Solution

  • Indeed, recursion would be the way to find the root value. I don't really see a need to create the tree data structure: when the values of all the leaf values are input as a list, then we really have all information in a suitable format.

    Here is how the majority function could look when it is given only the list of leaf values:

    import random
    
    def majority(leaves):
        counter = 0
    
        def recur(start, size):
            nonlocal counter
            if size == 1:
                counter += 1  # about to access a leaf's value
                return leaves[start]
            size //= 3
            # Randomly choose which child we will leave as "plan B" in case
            #  two others give opposite values
            last = random.randint(0, 2)
            # Get the values of the two other children, using recursion
            val1 = recur(start + ((last+1)%3)*size, size)
            val2 = recur(start + ((last+2)%3)*size, size)
            # If equal, we do not need to retrieve the remaining child's value
            if val1 == val2:
                return val1
            # Too bad,... we need the value of the third child to break the tie
            return recur(start + last*size, size)
        
        rootval = recur(0, len(leaves))
        return rootval, counter
    

    You would call it like this:

    h = 2
    leaves = [random.randint(0,1) for i in range(0, 3**h)]
    
    rootval, counter = majority(leaves)
    
    print("The leaves are {}".format(leaves))
    print("Accessed {} leaves to find that root's value is {}".format(counter, rootval))