Search code examples
algorithmdata-structurestreelanguage-agnosticmax-heap

How many comparisons a call to removeMin() will make in max heap of 7-ary tree?


Assume that a max heap with 10^6 elements is stored in a complete 7-ary tree. Approximately how many comparisons a call to removeMin() will make?

  • 5000
  • 50
  • 10^6
  • 500
  • 5

My solution: The number of comparisons should be equal to the number of leaf nodes at most because in max heap, the min. can be found at any of the leaf nodes which is not in the above options. Better approach was to take the square of ( log of 10^6 to the base 7) which gives 50 but this is only when we are sure that the minimum element will follow a single branch across tree which in the case of max heap is not correct. I hope that you can help.


Solution

  • There's no "natural" way to remove the minimum value from a max heap. You simply have to look at all the leaf nodes to figure out which one happens to be the minimum.

    The question then is how many leaf nodes there are. Intuitively, we'd expect the fraction of nodes in the heap that are leaves to be pretty close to the total number of nodes. Take it to the limit - if you have a 1,000,000-ary heap, you'd have one node in the top layer and all remaining 999,999 elements in the next layer. Even in the smallest case where the heap is a binary heap, you'd expect roughly half the elements to be in the bottom layer.

    More specifically, let's do some math! How many leaves will a 7-ary heap with n nodes have? Well, each node in the tree will either

    • be a leaf, or
    • have seven children,

    with one possible exception that, since the bottommost row might not be full, there might be one node with fewer than seven children. Since that's just a one-off, we can ignore that last node when we're dealing with millions of elements. A quick proof by induction can be used to show that any tree where each node either has no children or seven children will have seven times as many leaf nodes as internal nodes (prove this!), so we'd expect than (7/8)ths of the nodes will be leaves, for a total of 875,000 leaves to check.

    As a result, the best answer here would be roughly 106 comparisons.