Search code examples
python-3.xpathbinary-treedepth-first-search

Print path with recursive DFS with Python


I wrote this code to print path from root to target node in binary tree using recursive DFS:

    def dfs(self, current, target, path=[]):
        if current == None:
            path = []
            return False
        elif current.val == target.val:
            return True
        else:
            path.append(current.val)
            return self.dfs(current.left, target, path) or 
                   self.dfs(current.right, target, path)
 

When I run this code, I notice the path variable has a list of all nodes that it traversed to get to the target node rather than just the direct path from root to target. What am I missing? I know I might be missing resetting the path variable somewhere, but I am not sure..


Solution

  • There are these issues:

    • A syntax issue (probably because you changed it in your post): the last statement is split over two lines, which is not allowed. If you want that, add parenthesis, or append \ after the first line of these two.

    • path = [] (inside the first if block) is a useless statement: it just assigns a new value to a local path name, and never uses it again. Maybe you thought it would alter the path of the caller, but assignment never does that in Python. On the other hand, if you mutate the list that path already references, then that is the caller's list that mutates. For instance, this is what happens append. But even if in this case you must empty the caller's list, that would not be the correct logic, since the target node might still be on a path that has some of the same ancestors. In short, this path = [] statement should just be removed.

    • The path will get nodes appended as they are visited, even when the two recursive calls return False, and False is returned for the current call, that node will still remain on the path. You should remove it from the path -- and only that node.

    • Giving the path parameter a default value of [] is a bad idea. This means that if you don't pass this argument, this default list is being used, and it will still be that list if you make yet another call without that argument. This will mix up results from two different searches. Instead, you could use None as default, and have the code set it to a new empty list when it is found to be None.

    • Not a problem, but it would make sense to also append the target node to the path.

    So with those fixes, your code would look like this:

        def dfs(self, current, target, path):
            if current is None:
                return False
            # Append the node's value, also when it is the target:
            path.append(current.val)
            if current.val == target.val:
                return True
            if (self.dfs(current.left, target, path) or 
                    self.dfs(current.right, target, path)):
                return True 
            # If recursive calls were unsuccessful, remove current
            #    node from the path
            path.pop()
            return False
    

    Now it will work. Still, I would suggest some changes.

    I would not use a path argument at all: it is non-intuitive that the original caller first has to create a path (set to an empty list) and then provide it to this function. The function should take care of that itself.

    Have the function return the path. The boolean that you return now can be removed: instead return None if the target was not found, and a list (the path) when the target was found.

    Instead of appending and then popping again (because the target was not yet found), you could only start building the path when you have actually found the target, and then while unwinding from recursion, add the ancestor nodes to that path. So instead of building the path in preorder fashion, build it in postorder fashion.

    Here is how that idea can be implemented:

        def dfs(self, current, target):
            def recur(current):
                if current is None:
                    return
                if current.val == target.val:
                    return [target]  # Start the path from the last node backwards
                reversed_path = recur(current.left) or recur(current.right)
                if reversed_path:
                    # Append an ancestor once the target was found:
                    reversed_path.append(current)
                    return reversed_path
                    
            reversed_path = recur(current)
            # Don't return a boolean, but the path or None 
            if reversed_path:
                return reversed_path[::-1]
    

    With this variant, the caller must not provide a path argument, but instead gets the path as return value. They should check whether this path is None, which would indicate the target node was not found.