Search code examples
javaarraysalgorithmsortingarray-merge

Merging 4 sorted Arrays into one


I have this method to merge 2 sorted arrays into one sorted array:

    public void merge(T[] a, int l1, int r1, T[] b, int l2, int r2, T[] c, int l3) {
        while (l1 < r1 && l2 < r2) {
            if (a[l1].compareTo(b[l2]) < 0) {
                c[l3++] = a[l1++];
            } else
                c[l3++] = b[l2++];
        }

        while (l1 < r1)
            c[l3++] = a[l1++];
        while (l2 < r2)
            c[l3++] = b[l2++];
    }

But now I want to do it with 4 arrays at once.

I tried really long to come up with a solution, but wasn’t really successful. Does somebody have an idea how to do it?


Solution

  • I've generalized the problem to "merging N sorted arrays into a single sorted array".

    The code provided in the question utilizes generics. But it introduces a problem because arrays are not type-safe. In short, there's a substantial difference in their behavior: arrays are covariant and, on the other hand, generics are invariant. Due to that, compiler will not be abler to identify a problem when generics and arrays are mixed. It's a good practice to avoid usage of generic arrays.

    Also, I've taken into account that it is clearly an algorithmic problem (therefore its audience broader than readers who have a deep insight in Java, which is required to grasp generic-based implementation) I've decided to create two flavors of solution one using arrays exclusively, another with generics and Collections framework.

    Non-generic version

    Below is the description of how to merge an arbitrary number of sorted arrays of primitives:

    • find the total number of elements and create a resulting array based on it;
    • define an array that will maintain a current position in each of the source arrays;
    • using a nested for loop for each position in the resulting array, pick the lowest value of all currently accessible values.

    The time complexity of this algorithm is O(n * m) (where n - is the total number of elements in all arrays and m is the number of arrays).

    The implementation might look like this:

    public static int[] mergeNSorted(int[]... arrays) {
        int[] result = new int[getTotalLength(arrays)];
        int[] positions = new int[arrays.length]; // position for each array
        
        for (int pos = 0; pos < result.length; pos++) {
            int minCurVal = Integer.MAX_VALUE;
            int curArr = 0;
            for (int i = 0; i < arrays.length; i++) {
                if (positions[i] < arrays[i].length && arrays[i][positions[i]] < minCurVal) {
                    minCurVal = arrays[i][positions[i]];
                    curArr = i;
                }
            }
            result[pos] = minCurVal;
            positions[curArr]++;
        }
        return result;
    }
    
    public static int getTotalLength(int[][] arrays) {
        long totalLen = 0;
        for (int[] arr : arrays) totalLen += arr.length;
        
        if (totalLen > Integer.MAX_VALUE) throw new IllegalArgumentException("total length exceeded Integer.MAX_VALUE");
        return (int) totalLen;
    }
    

    main() - demo

    public static void main(String[] args) {
        int[][] input =
            {{1, 3}, {}, {2, 6, 7}, {10}, {4, 5, 8, 9}};
    
        System.out.println(Arrays.toString(mergeNSorted(input)));
    }
    

    Output

    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    

    Generic version

    In this version, input considered to be a list containing multiple lists of generic type T which expected to implement Comparable interface.

    This solution enhances the array-based implementation provided above, reducing the overall time complexity to O(n * log m) (where n - is the total number of elements in all arrays and m is the number of arrays).

    Instead of performing m iteration for each resulting element it maintains a PriorityQueue, which in this case represents a Min-Heap (i.e. when a head element is being retrieved from it, it'll have the lowest value of all the elements that are present in the queue).

    Every element in queue wraps the value of a particular element retrieved from one of the given lists, as well the data regarding the source of this value (i.e. an index of the list and a position inside this list).

    This wrapper over the element of the nested list can be represented by the class shown below.

    public class ElementWrapper<V extends Comparable<V>> implements Comparable<ElementWrapper<V>> {
        private V value;
        private int listIndex;
        private int position;
        
        public ElementWrapper(V value, int listIndex, int position) {
            this.value = value;
            this.listIndex = listIndex;
            this.position = position;
        }
        
        // getters
        
        @Override
        public int compareTo(ElementWrapper<V> o) {
            return value.compareTo(o.getValue());
        }
    }
    

    Note, that this class implements the of Comparable interface based on the value of wrapped list element.

    The queue is being prepopulated with the first element of each non-empty list. And then until the queue is not empty, its lowest element is being removed and gets added to the resulting list. Also, if a list to which the latest element retrieved from the queue points, has more elements, the next of them will be added into the queue.

    Note that both operations of adding a new element into the priority queue add() and removing its head element remove() according to the documentation has a cost of O(n) time (where n is the number of elements in the queue).

    The same time complexity can be achieved by utilizing a TreeSet instead, but in practice PriorityQueue will perform better because a heap is easier to maintain than a red-black tree.

    The code might look like this:

    public static <T extends Comparable<T>> List<T> mergeNSorted(List<List<T>> lists) {
        List<T> result = new ArrayList<>();
        Queue<ElementWrapper<T>> queue = getInitializedQueue(lists);
        
        while (!queue.isEmpty()) {
            ElementWrapper<T> next = queue.remove();
            result.add(next.getValue());
            
            if (next.getPosition() + 1 < lists.get(next.getListIndex()).size()) {
                queue.add(new ElementWrapper<>(lists.get(next.getListIndex()).get(next.getPosition() + 1),
                                               next.getListIndex(),
                                               next.getPosition() + 1));
            }
        }
        return result;
    }
    
    public static <T extends Comparable<T>> Queue<ElementWrapper<T>> getInitializedQueue(List<List<T>> lists) {
        Queue<ElementWrapper<T>> queue = new PriorityQueue<>();
        for (int i = 0; i < lists.size(); i++) {
            if (lists.get(i).isEmpty()) continue;
            queue.add(new ElementWrapper<>(lists.get(i).get(0), i, 0));
        }
        return queue;
    }
    

    main() - demo

    public static void main(String[] args) {
        List<List<Integer>> genericInput =
            List.of(List.of(1, 3), List.of(), List.of(2, 6, 7), List.of(10), List.of(4, 5, 8, 9));
        
        System.out.println(mergeNSorted(genericInput));
    }
    

    Output

    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]