Search code examples
algorithmselectionbig-omedian-of-medians

Selection routine with dupes?


I recently wrote a program after analyzing the k-th smallest element algorithm, first without the case of duplicates.

Now, however, I would like to analyze the expected asymptotic runtime for finding, say, the median of an array when there are exactly j duplicates. I haven't modified my code for such, and thus the performance slows down a bit because of the j duplicates.

I'm not sure how to begin? Can anyone point me towards such a recurrence relation?

I've derived the following, where n is the size of the input array

T(n) <= 1/2*T(3/4*n) + 1/2*T(n)

but am quite unclear how to proceed with duplicate keys involved.

Thanks


Solution

  • The randomized solution as demonstrated here is

     T(n) <= T(3/4*n) + n-1  =>  T(n) <= 4n
    

    The complexity of the algorithm may depend on j but don't expect it to be miraculously less than linear time. Why? take a random array of size n/2, duplicate it completely and run the ideal algorithm for the problem with duplicates. You'll have

    T(n) <= 4(n/2) => T(n) <= 2n
    

    when each element is duplicated twice (exactly n/2 duplicates)