Search code examples
algorithmbinary-treebinary-search-treered-black-tree

Given a red-black tree on n nodes, what is the maximum number of red nodes on any root to leaf path?


This was a quiz question. I'm not sure whether my answer was right. Please help me out.

Lets say the height is h, since no two consecutive nodes (as we go up the tree) can be red, wouldn't the max number of red nodes be h/2? (h = log n)

Somehow, I feel that is not the correct answer.

Any help/input would be greatly appreciated!

Thank you so much in advance!


Solution

  • ** Edit ** This answer assumes a definition of height to be the number of nodes in the longest path from root to leaf (used, e.g., in lecture notes here) including the "virtual" black leaf nodes. More common definition counts the number of edges, and does not include the leaf nodes. With this definition the answer is round(h/2), and if you include the leaf nodes in to height round_down(h/2). ** Edit ends **

    If you follow the rules that the root node is black as in Wikipedia, then the correct answer is the largest integer smaller than h/2. This is just because root and leaves are black, and half of the nodes (rounded up) in between can be red. I.e. round((h-2)/2)

    You can also find the rule just by considering some small red-black trees of different heights.

    Case h=1 root is black -> 0 red nodes

    Case 'h=2' root is black and leaves are black -> 0 red nodes

    Case h=3 root is black, second level can be red, and leaves must be black -> max 1 red node

    Case h=4 root is black, second level can be red, third level must be black, and leaves must be black -> max 1 red node

    Case h=5 black, red, black, red, black -> max 2 red nodes.

    The h as a function of n is trickier, but it can be shown that h <= 2 log (n+1), which guarantees the logarithmic search time. For a proof see, e.g., Searching and Search Trees II (page 11). The proof is based on the fact that the rules of red-black tree guarantee that a subtree starting at x contains at least 2^(bh(x)) - 1 internal nodes, where bh(x) is the black height - number of black nodes in path from root to leaf. This is proven by induction. Then by noting that at most half of the nodes are black (we are speaking of subtrees so the root can be red) that bh(x) >= h/2. Now using these results we get n >= 2^bh(x) - 1 >= 2^(h/2) -1. Solving for h, we get the answer h <= 2 log(n+1).

    As the question was a quiz, it should be enough to say that h is proportional to log(n) or even about log(n).