Search code examples
arraysalgorithmdata-structuresbinary-treegraph-theory

What is the array representation of this (incomplete) tree?


According to LC, the array representation of this tree (let's call it a) is a = [1, NULL, 2, 3] enter image description here

However, this violates the algorithm that the left child of a root in position i of a is in position 2*i and the right child of a root in position i of a is in position 2*i+1 (source)

My questions are:

  1. In the case of incomplete trees like the one above, do you write out all the missing nodes as NA's (except at the very end of the binary tree)? In the solution given, not all the missing nodes are written out. In particular 1's left child is missing and is represented as NULL, but its children are not even included in the array representation. And therefore, even though 3 is 2's left child, its position (4) doesn't follow the formula of 2*i, where i is 2's position.
  2. If the solution given above ([1, NULL, 2, 3]) is the correct array representation of the binary tree given, then what is the formula for identifying the positions of a node's children when the node in question is in position i? And what is the rule of when a missing node should or should not be included in the array representation of an incomplete binary tree?

Context: I'm trying to code up an algorithm that will take the array representation of a tree as an input and output the in-order traversal of that tree.


Solution

  • In the case of incomplete trees like the one above, do you write out all the missing nodes as NA's (except at the very end of the binary tree)?

    We discuss serialization here (what LeetCode innternally uses as input format is text, not an array), and there are many ways to serialize a tree.

    In the case of LeetCode's format, the serialization uses a JSON format, in a breadth first order, where null (lowercase!) denotes the absence of a node where it would have been possible to have a node. You wonder why the left-side grandchildren of node 1 are not represented with null in the serialized format. LeetCode's format does not include them, because they cannot be anything else than null as their parent is already null, and therefore it is redundant information. True, it violates the formula where the positions of a node's children are at i*2 and i*2+1, but that is a formula that is used in different representations, such has commonly used for a binary heap.

    If the solution given above ([1, NULL, 2, 3]) is the correct array representation of the binary tree given, then what is the formula for identifying the positions of a node's children when the node in question is in position i?

    There is no direct formula for that; not when the only information you have is a parent node's index.

    what is the rule of when a missing node should or should not be included in the array representation of an incomplete binary tree?

    The rule is: include a null when it is a direct child of a (non-null) parent node. Do not include it when that parent is non-existent (null).

    I'm trying to code up an algorithm that will take the array representation of a tree as an input and output the in-order traversal of that tree.

    The most straightforward way to approach this, is to first build the tree data structure using typical node objects (with left/right attributes). And in a second phase produce the traversal you want from it.

    To create the node-based data structure, you can use a breadth first traversal on the array representation, where you only push non-null nodes on the queue. Here is pseudo code for it:

    function makeTree(arr):
        if arr.length == 0:
            return null // empty tree
        nodes = [new TreeNode(arr[0])]
        for (j = 1, i = 0; j < arr.length; j++):
            node = arr[j] == null ? null : new TreeNode(arr[j])
            if j % 2 == 1:
                nodes[i].left = node
            else
                nodes[i].right = node
                i++
            if node != null:
                nodes.push(node)
        return nodes[0]
    

    In answer to other questions, I have provided implementations of this algorithm in JavaScript and Python.