I'm currently taking a course on algorithms and data structures, but I'm having some difficulty understanding how to calculate the complexity of this function as it involves a recursive function within another recursive function. Could someone provide some guidance, please?
Let x be the root of a Binary Search Tree (BST), and let's assume that the number of nodes is n.
DISCLAIMER: I am well aware that this function is highly inefficient. These functions have been written for illustrative purposes only.
Pseudocode:
countForEach(x){ //prints the number of nodes for each subtree of the tree rooted at x
if(x != NIL){
CountForEach(x.left)
CountForEach(x.right)
print(countRec(x))
}
}
countRec(y) //returns the number of nodes in the subtree rooted in y
if(y == NIL):
return 0
else:
return countRec(y.left) + countRec(y.right) + 1
I know that countRec(x)
has a time complexity of Θ(m), where m is the number of nodes in the subtree rooted at x and that countRec(y)
is called by countForEach(x)
n times where n is the number of nodes of the tree rooted at x.
What prevents me from reaching a solution is that m decreases as CountForEach
executes.
In comments you've already derived what the result would be if the tree were degenerate, i.e. where each node only has one child, except for the (single) leaf node: 𝑛+(𝑛−1)+...+3+2+1 = 𝑛(𝑛+1)/2, which is θ(𝑛²). This is a worst case scenario in terms of 𝑛.
A best case scenario occurs when the tree is perfect, i.e. where there is no node that has one child only and all leaves are at the same level.
Let ℎ be the height of the perfect binary tree; the number of nodes 𝑛 is then equal to 2ℎ+1−1.
Counting the number of calls of countRec
in a top-down manner, we get these terms per level of the tree:
2ℎ+1−1
+ 21⋅(2ℎ−1)
+ 22⋅(2ℎ−1−1)
+ 23⋅(2ℎ−2−1)
+ ...
+ 2ℎ
Writing each term as the subtraction of two powers of 2:
= 2ℎ+1−20
+ 2ℎ+1−21
+ 2ℎ+1−22
+ 2ℎ+1−23
+ ...
+ 2ℎ+1−2ℎ
Adding up the terms:
= (ℎ+1)2ℎ+1−(2ℎ+1−1)
= ℎ2ℎ+1+1
As in this perfect tree 𝑛=2ℎ+1−1, this translates to:
= (log2(𝑛+1)−1)(𝑛+1) + 1 which is θ(𝑛log𝑛)
In conclusion we can say the worst case time complexity is θ(𝑛²) and the best case time complexity is θ(𝑛log𝑛).