Search code examples
pythonpython-3.xalgorithmtime-complexitybig-o

How do I identify O(nlogn) exactly?


I have understood O(logn) in a sense that it increases quickly but with larger inputs, the rate of increase retards. I am not able to completely understand

  1. O(nlogn)

  2. the difference between an algorithm with complexity nlogn and complexity n + logn.

I could use a modification of the phone book example and/or some basic python code to understand the two queries


Solution

  • How do you think of O(n ^ 2)? Personally, I like to think of it as doing O(n) work O(n) times.

    A contrived O(n ^ 2) algorithm would be to iterate through all pairs of numbers in 0, 1, ..., n - 1

    def print_pairs(n):
        for i in range(n):
            for j in range(i + 1, n):
                print('({},{})'.format(i, j))
    

    Using similar logic as above, you could do O(log n) work O(n) times and have a time complexity of O(n log n).

    As an example, we are going to use binary search to find all indices of elements in an array.

    Yes, I understand this is a dumb example but here I don't want to focus on the usefulness of the algorithm but rather the complexity. For the sake of the correctness of our algorithm let us assume that the input array is sorted. Otherwise, our binary search does not work as intended and could possibly run indefinitely.

    def find_indices(arr):
        indices = []
        for num in arr:
            index = binary_search(arr, 0, len(arr), num)
            indices.append(index)
        return indices
    
    def binary_search(arr, l, r, x): 
    
        # Check base case 
        if r >= l: 
    
            mid = l + (r - l)/2
    
            # If element is present at the middle itself 
            if arr[mid] == x: 
                return mid 
    
            # If element is smaller than mid, then it  
            # can only be present in left subarray 
            elif arr[mid] > x: 
                return binary_search(arr, l, mid-1, x) 
    
            # Else the element can only be present  
            # in right subarray 
            else: 
                return binary_search(arr, mid + 1, r, x) 
    
        else: 
            # Element is not present in the array 
            return -1
    

    As for your second question, surely, log n << n as n tends to infinity so

    O(n + log n) = O(n)

    In theory, the log n is dwarfed by the n as we get arbitrarily large so we don't include it in our Big O analysis.

    Juxtaposed to practice, where you might want to consider this extra log n work if your algorithm is suffering performance and/or scaling issues.