Search code examples
algorithmheapheapsort

Find the top K elements in O(N log K) time using heaps


Let's say if i have a list containing:

lst = [4,0,8,3,1,5,10]

and I'm planning to use a heap structure to help me retrieve the top k largest number where k is a user input.

I understand that heap sort is O(N log N) where we first take O(N) time to place them in a min/max heap and the O(log N) time to retrieve elements.

But the problem I'm facing now is that I'm required to retrieve the top k users in O(N log K) time. If my k is 4, i would have:

[10,8,5,4] 

as my output. The thing I'm confused about is, at the early stage of forming the heap, am i supposed to heap the entire list in order to retrieve the top k elements?


Solution

  • The log K term would suggest that you would only want a heap of size K. Here is one possible solution.

    Start with an unsorted array. Convert the first K elements to a min-heap of size K. At the top of the heap will be your smallest element. Successively replace the smallest element with each of the remaining N - K elements in the array (that do not constitute a part of the heap), in O(log K) time. After O(N) such operations, the first K elements in the array (or, the K elements of the heap you created) will now have the K largest elements in your array.

    There are other solutions but this is the most straightforward.