Search code examples
javaarraysalgorithmbig-omergesort

How to merge an array of size M into another array of size 2M


I got this question in an online code challenge. I needed to merge two sorted arrays, one of size M with M elements into another one, with M elements and capacity 2M. I provided the following solution:

class ArraysSorting {

/**
 * Function to move elements at the end of array
 * 
 * @param bigger array
 */
void moveToEnd(int bigger[]) {
    int i = 0, j = bigger.length - 1;
    for (i = bigger.length - 1; i >= 0; i--)
        if (bigger[i] != 0) {
            bigger[j] = bigger[i];
            j--;
        }
}

/**
 * Merges array smaller array of size M into bigger array of size 2M
 * @param bigger array
 * @param smaller array
 */
void merge(int bigger[], int smaller[]) {
    moveToEnd(bigger);
    int i = smaller.length;
    int j = 0;
    int k = 0;
    while (k < (bigger.length)) {
        if ((i < (bigger.length) && bigger[i] <= smaller[j]) || (j == smaller.length)) {
            bigger[k] = bigger[i];
            k++;
            i++;
        } else {
            bigger[k] = smaller[j];
            k++;
            j++;
        }
    }
  }
}

Is there a more efficient way to do this?

The Time Complexity: O(2M)


Solution

  • You can't beat linear time because you have to at least scan all the 2M elements, so that's the best you can ever get.

    In practice though, you can optimize it a little further. There's no need to shift the elements of bigger towards the end; just write the results right-to-left rather than left-to-right (and of course, you'll need to invert the comparison, at any step you'll want to select the largest element rather than the smallest).

    Also, this is not good:

    if ((i < (bigger.length) && bigger[i] <= smaller[j]) || (j == smaller.length)) {
        /* ... */
    }
    

    You should test j == smaller.length before accessing smaller[j]; the code as is will possibly access out of bounds positions in smaller. Do this instead:

    if ((j == smaller.length) || (i < (bigger.length) && bigger[i] <= smaller[j])) {
        /* ... */
    }
    

    Overall, I do think you can make the code simpler, here's something that in my opinion is easier to read because the if conditions are smaller and easier to understand (it also approaches the traditional way in which you merge two arrays and avoids the extra O(M) work of shifting the elements to the back):

    void merge(int bigger[], size_t bigger_len, int smaller[], size_t smaller_len) {
        ssize_t smaller_i, bigger_i, idx;
    
        if (smaller_len == 0)
            return;
    
        smaller_i = smaller_len-1;
    
        if (bigger_len == 0)
            bigger_i = -1;
        else
            bigger_i = bigger_len-1;
    
        idx = bigger_len+smaller_len-1;
    
        while (smaller_i >= 0 && bigger_i >= 0) {
            if (bigger[bigger_i] > smaller[smaller_i]) {
                bigger[idx] = bigger[bigger_i];
                bigger_i--;
            }
            else {
                bigger[idx] = smaller[smaller_i];
                smaller_i--;
            }
            idx--;
        }
    
        while (smaller_i >= 0) {
            bigger[idx] = smaller[smaller_i];
            smaller_i--;
            idx--;
        }
    }
    

    It's easy to see that the first loop runs as long as a comparison between two elements in the different arrays is possible (rather than having the loop always run and use complicated if tests inside). Also note that since output is being written to bigger, once the first loop terminates, we only need to make sure that the rest (if any) of smaller that is left is copied over to bigger. The code is in C, but it's pretty much the same in Java. bigger_len and smaller_len are the number of elements in bigger and in smaller; it is assumed that bigger has enough space for bigger_len+smaller_len elements. The initial if tests to assign to smaller_i and bigger_i are necessary to handle edge cases where subtracting 1 would overflow the (unsigned) range of size_t; they are unnecessary in Java since Java doesn't have unsigned types (correct me if I'm wrong, I haven't done Java recently).

    Time complexity remains O(M).