Search code examples
pythonalgorithmmedian-of-medians

Complexity of finding k-th largest element using Median of Medians


I was reading the article on finding the k-th highest element in an array through the median-of-medians algorithm at ardendertat. In the portion explaining the complexity, the author seems to have discounted one factor, the cost of finding the median-of-medians for each partition recursively. Surely I can't partition all sub-arrays by the initial pivot, right? So won't this add to the complexity?


Solution

  • The position p of the median of medians after partitioning the array is between 0.3*n and 0.7*n.

    So after one round, we have three possibilities:

    1. p == n-k, by pure luck, the first pivot is the k-th largest element, found in O(n) (median of medians and partitioning are both O(n)).
    2. p > n-k, then the k-th largest element is smaller than the first pivot, and we need to find the k - (n-p)-th largest element of the first part, which has at most 0.7*n elements, so the total cost for finding the k-th largest element is

      T(n) <= T(0.7*n) + C*n
      
    3. p < n-k, then we need to find the k-th largest element of the second part (after the pivot), that part is also at most 0.7*n elements large, so we again have the estimate

      T(n) <= T(0.7*n) + C*n
      

    Iterating, we find

    T(n) <= T((0.7)^k * n) + C*(1 + 0.7 + ... + (0.7)^(k-1))*n
         <= T(1) + C/(1 - 0.7)*n
    

    which evidently is O(n).