Search code examples
arraysalgorithmmedian

Finding a median of 2 arrays of the same size - the O(log n) algorithm doesn't yield a correct result


I'm trying to solve the problem of calculating median of merged two sorted arrays of the same size, with distinct elements.

Source of algorithm: https://www.geeksforgeeks.org/median-of-two-sorted-arrays/ This algorithm is used in several sources over the internet as the O(log n) solution. However I don't think it works for the example I made up.

My counter-example:

We have 2 sorted arrays with no duplicates:

[2,3,12,14] & [1,5,8,9]

Merged sorted array is: a = [1,2,3,5,8,9,12,14] Median: 13/2 = 6.5

Following the algorithm:

Median of [2,3,12,14] is (3+12)/2= 7.5 = m1

Median of [1,5,8,9] is (5+8)/2 = 6.5 = m2

We see m1>m2. So following the algorithm, we consider the first half of first array, and second half of second array. We have a1 = [2,3] and a2 = [8,9].

Now we reached a base case and we have a result of (max(a1[0],a2[0]) + min(a1[1],a2[1]))/2 = 8+3=11/2=5.5 which is clearly not 6.5.

This is the only algorithm I see that has O(log n) solution but it seems flawed. Is there something I'm missing here?


Solution

  • To always give the same result of the first method, the second one must end up with the same numbers in the final iteration.

    For instance, the example provided should lead to 6.5

    [2, 3, 12, 14], [1, 5, 8, 9] → [1, 2, 3, 5, 8, 9, 12, 14] → (5 + 8)/2 → 6.5

    To ensure that, when dividing ranges with even number of elements, you must add the element below are beyond the middle:

    [2, 3, 12, 14], [1, 5, 8, 9] → [2, 3, 12], [5, 8, 9] → [3, 12], [5, 8] → 6.5

    As a matter of fact, the relevant part of the code in the page you linked is this

    int getMedian(int ar1[],  
                  int ar2[], int n)
    {
        // ...
        if (m1 < m2) 
        { 
            if (n % 2 == 0) 
                return getMedian(ar1 + n / 2 - 1,    // <- Note the difference
                                 ar2, n - n / 2 + 1); 
            return getMedian(ar1 + n / 2,            // <-
                             ar2, n - n / 2); 
        } 
    
        if (n % 2 == 0) 
            return getMedian(ar2 + n / 2 - 1,        // The same here
                             ar1, n - n / 2 + 1); 
        return getMedian(ar2 + n / 2,  
                         ar1, n - n / 2);