Search code examples
pythontreegraph-theorybreadth-first-search

Python code to do breadth-first discovery of a non-binary tree


My problem: I have a known root node that I'm starting with and a specific other target node that I'm trying to find the shortest path to. I'm trying to write Python code to implement the Iterative Deepening Breadth-First Search algo, up to some max depth (say, 5 vertices).

However, there are two features that (I believe) make this problem unlike virtually all the other SO questions and/or online tutorials I've been able to find so far:

  1. I do not yet know the structure of the tree at all: all I know is that both the root and target nodes exist, as do many other unknown nodes. The root and target nodes could be separated by one vertice, by 5, by 10, etc. Also, the tree is not binary: any node can have none, one, or many sibling nodes.

  2. When I successfully find a path from the root node to the target node, I need to return the shortest path between them. (Most of the solutions I've seen involve returning the entire traversal order required to locate a node, which I don't need.)

How would I go about implementing this? My immediate thought was to try some form of recursion, but that seems much better-suited to Depth-First Search.

TLDR: In the example tree below (apologies for ugly design), I want to traverse it from Root to Target in alphabetical order. (This should result in the algorithm skipping the letters K and L, since it will have found the Target node immediately after J.) I want the function to return:

[Root, B, E, H, Target]

enter image description here


Solution

  • You're basically looking for the Dijkstra algorithm. Dijkstra's algorithm adapts Breadth First Search to let you find the shortest path to your target. In order to retrieve the shortest path from the origin to a node, all that needs to be stored is the parent for each node discovered

    Let's say this is your tree node:

    class TreeNode:
        def __init__(self, value, children=None, parent=None):
            self.value = value
            self.parent = parent
            self.children = [] if children is None else children
    

    This function returns the path from the tree root node to the target node:

    from queue import Queue
    
    def path_root2target(root_node, target_value):
        def build_path(target_node):
            path = [target_node]
            while path[-1].parent is not None:
                path.append(path[-1].parent)
            return path[::-1]
        q = Queue()
        q.put(root_node)
        while not q.empty():
            node = q.get()
            if node.value == target_value:
                return build_path(node)
            for child in node.children:
                child.parent = node
                q.put(child)
        raise ValueError('Target node not found')
    

    Example:

    >>> D = TreeNode('D')
    >>> A = TreeNode('A', [D])
    >>> B = TreeNode('B')
    >>> C = TreeNode('C')
    >>> R = TreeNode('R', [A, B, C])
    >>> path_root2target(R, 'E')
    ValueError: Target node not found
    >>> [node.value for node in path_root2target(R, 'D')]
    ['R', 'A', 'D']
    

    If you want to return the node values (instead of the nodes themselves, then just modify the build_path function accordingly.