Search code examples
algorithmsortingtime-complexitymergesortcounting-sort

Find an algorithm for sorting integers with time complexity O(n + k*log(k))


Design an algorithm that sorts n integers where there are duplicates. The total number of different numbers is k. Your algorithm should have time complexity O(n + k*log(k)). The expected time is enough. For which values of k does the algorithm become linear?

I am not able to come up with a sorting algorithm for integers which satisfies the condition that it must be O(n + k*log(k)). I am not a very advanced programmer but I was in the problem before this one supposed to come up with an algorithm for all numbers xi in a list, 0 ≤ xi ≤ m such that the algorithm was O(n+m), where n was the number of elements in the list and m was the value of the biggest integer in the list. I solved that problem easily by using counting sort but I struggle with this problem. The condition that makes it the most difficult for me is the term k*log(k) under the ordo notation if that was n*log(n) instead I would be able to use merge sort, right? But that's not possible now so any ideas would be very helpful.

Thanks in advance!


Solution

  • Here is a possible solution:

    • Using a hash table, count the number of unique values and the number of duplicates of each value. This should have a complexity of O(n).

    • Enumerate the hashtable, storing the unique values into a temporary array. Complexity is O(k).

    • Sort this array with a standard algorithm such as mergesort: complexity is O(k.log(k)).

    • Create the resulting array by replicating the elements of the sorted array of unique values each the number of times stored in the hash table. complexity is O(n) + O(k).

    • Combined complexity is O(n + k.log(k)).

    For example, if k is a small constant, sorting an array of n values converges toward linear time as n becomes larger and larger.

    If during the first phase, where k is computed incrementally, it appears that k is not significantly smaller than n, drop the hash table and just sort the original array with a standard algorithm.