Here's the question:
Consider the following sorting algorithm: I. Insert the given input A[1], A[2], ..., A[n] into a binary search tree T in the given order, starting from an empty tree; II. Do an inorder traversal of T to output the elements in non-decreasing order. (a) What is the worst case time for the algorithm? (b) What is the best case time for the algorithm?
My answer:
I created a BST in t(n) = 2(n/2) + n
as we have to go through each individual element, I also then printed out the BST inorder in t(n) = 2(n/2) + n
as again we have to go through each element. I then added the time complexities together ending with, t(n) = 4(n/2)+2n
.
According to Master's method, this complies with case 1. Making it θ(n^2)
. If it is θ(n^2)
, this answers both a and b of the question. I don't see how there could be a better performance, or in that case a worse performance.
Is this a trick question?
Short answer:
You added the RHS of the two equations but not the LHS – it should be
2T(n)
.
Sure, the verbal description of T(n)
is the "time complexity function", but of what? In this case it should the time complexity of either the BST creation or in-order traversal, not both together.
So the end result is 2T(n) = 4T(n/2) + 2n
. Cancel the 2, and you get O(n log n)
.
tl;dr answer:
But wait, why
n log n
for traversal too? There are onlyn
elements, right?
By splitting the problem into two equal halves T(n/2)
, you have implicitly assumed that the tree is balanced after every insertion. For a simple non-self-balancing BST tree implementation, this will not always be the case. If the tree is "tilted" to one side, an insertion could be as much as O(n)
(i.e. O(n^2)
construction).
Fortunately, for most commonly used self-balancing tree types, e.g. Red-Black and AVL, insertion is guaranteed to be O(log n)
or below. Therefore the correct recurrence relation for tree construction is T(n) = T(n - 1) + log n
, which works out to be O(n log n)
(Stirling's equation).
Nonetheless, traversal for any BST, balanced or not, is O(n)
, since there are only n
elements. This means that your recurrence relation for traversal is wrong - it should be T(n) = 2T(n/2) + 1
, since only O(1)
work is done for each node.