Search code examples
pythonrecursiontreetraversal

Understanding recursion in binary tree traversal


I'm trying to understand how this code works:

# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right

class Solution:
    def inorderTraversal(self, root: Optional[TreeNode], result: List[int] = None) -> List[int]:
        if result is None:
            result = []
        if root:
            self.inorderTraversal(root.left, result)
            result.append(root.val)
            self.inorderTraversal(root.right, result)
        return result

Suppose our tree is [5, 8, 7] (i.e., 5 is the root, its left daughter is 8 and its right daughter is 7). First, the root is given as the input. The empty results list is created. The root is not None, so the body of the if statement is executed. The first line of the if statement is supposed to call the traversal function on the left daughter of the root, i.e., on root.left. So we're back to the line "if result is None". This first if statement is skipped. For the second if statement, root.left is true, so the statement in the body of the second if statement is executed. Namely, the function calls itself for a second time, this time on the argument root.left.left. So we go back to the line "if result is None". This first if statement is skipped again. Now we get to the line "if root.left.left". But root.left.left is None, so we should not execute the code in the body of the statement and we should return "result". Where's my mistake?


Solution

  • Your analysis of the process for the input [5,8,7] is correct. It just is not complete. It looks like you wonder about return result when it is still empty. But this is intended behaviour. Maybe you thought that return result would return the result to the very first caller (the code challenge framework), but that is not what happens. That result is returned to the previous caller. And that caller will then perform the append of their node's value to the same list .

    Maybe it helps to visualise the call stack with boxes that are placed on top of eachother. A return only removes a box from the top of the stack, not the whole stack.

    Execution flows from top of this diagram to bottom. The nested boxes represent the depth of the recursion:

    ┌─ root: Node 5, result: None ──────────────────────┐
    │ result = []                                       │
    │ self.inorderTraversal(root.left, result)          │
    │  ┌─ root: Node 8, result: [] ─────────────────┐   │
    │  │ self.inorderTraversal(root.left, result)   │   │
    │  │  ┌─ root: None, result: [] ─────────┐      │   │
    │  │  │ return result  # []              │      │   │
    │  │  └──────────────────────────────────┘      │   │
    │  │ result.append(root.val)  # [8]             │   │
    │  │ self.inorderTraversal(root.right, result)  │   │
    │  │  ┌─ root: None, result: [8] ────────┐      │   │
    │  │  │ return result  # [8]             │      │   │
    │  │  └──────────────────────────────────┘      │   │
    │  │ return result  # [8]                       │   │
    │  └────────────────────────────────────────────┘   │
    │ result.append(root.val)  # [8, 5]                 │
    │ self.inorderTraversal(root.right, result)         │
    │  ┌─ root: Node 7, result: [8, 5] ─────────────┐   │
    │  │ self.inorderTraversal(root.left, result)   │   │
    │  │  ┌─ root: None, result: [8, 5] ─────┐      │   │
    │  │  │ return result  # [8, 5]          │      │   │
    │  │  └──────────────────────────────────┘      │   │
    │  │ result.append(root.val)  # [8, 5, 7]       │   │
    │  │ self.inorderTraversal(root.right, result)  │   │
    │  │  ┌─ root: None, result: [8, 5, 7] ──┐      │   │
    │  │  │ return result  # [8, 5, 7]       │      │   │
    │  │  └──────────────────────────────────┘      │   │
    │  │ return result  # [8, 5, 7]                 │   │
    │  └────────────────────────────────────────────┘   │
    │ return result  # [8, 5, 7]                        │
    └───────────────────────────────────────────────────┘
    

    So the inner boxes are the deepest recursive calls. Each time the result list is passed to the deeper call, which may or may not add value(s) to it. If it doesn't add values it, it just means there are no nodes there. The caller will add their own node's value to it.

    Note also that return result is only needed because the top level caller expects the list as return value. All other times that a (recursive) call is made, the caller doesn't really care what is returned, because the caller already has the list: they have a reference to it with their own result name. If the recursive call added values to it, they will "have" those values, since they use the same list.