Search code examples
algorithmperformancepriority-queuebinary-heapheapq

How to improve this code for finding the k largest elements of an array?


The following code to find the k largest elements of an array is causing a TLE error. How can I optimize it to make it run faster?

import heapq    
for _ in range(int(input())):         
    n,k=map(int,input().split())   
    lists=list(map(int,input().split()))  
    
    heapq.heapify(lists)       
    
    for i in range(k+1):
        klargest=heapq.nlargest(i,lists)  
    
    print(*klargest)  

Solution

  • for i in range(k+1):
       klargest=heapq.nlargest(i,lists)  
    

    The time complexity of each klargest operation is O(k*log n)) where n is the number of elements in heap. In the above code snippet, this operation is running for k+1 times for the values [0,k].

    Calculating the time for the loop:

    iteration value (Time)

    i == 0 (0*log(n))

    i == 1 (1*log(n))

    i == 2 (2*log(n))

    ....

    i == k-1 ((k-1)*log(n))

    i == k ((k)*log(n))

    Total time will be the sum of time taken in each operation = (0.log(n)) + (1*log(n)) + .... + ((k-1)*log(n)) + ((k)*log(n))

    Total time = (0+1+2...+(k-1)+k)log(n) = ((k(k+1))/2)*log(n)

    Total time ~~ O(k^2*(log(n)))

    That's why the above code results in TLE.

    OPTIMISED APPROACH:

    import heapq    
    for _ in range(int(input())):         
        n,k=map(int,input().split())   
        lists=list(map(int,input().split()))  
        
        heapq.heapify(lists)       
        
        for i in range(n-k):
            heapq.heappop(lists)
        klargest = list(lists) # converting heap to list
        print(*klargest) 
    

    As Inbuilt heap in python is min-heap. So the above code is popping out minimum n-k elements from lists. Popping out each operation will take log(n) time. Thus total time will be ~~ (n-k)*logn. The remaining k elements in the heap are the klargest elements that we want to find.

    Thus, the time complexity for the above solution is O((n-k)*log(n)) == O(nlog(n)) which is a optimised time complexity.