Search code examples
javaperformancesortingtime-complexitymergesort

Why is my implementation of merge sort slow?


I have implemented merge sort to use with an algorithm for my assignment. It works but with a large amount of test cases it takes very long to give back results and it shouldn't. I have cleared my main method and im convinced that merge sort is the issue. Can somone explain what did i do to make it so unefficient?

    public static double[] merge(double[] arrayA, double[] arrayB){
        double[] arrayC = new double[0];

        while((arrayA.length != 0) && (arrayB.length != 0)){
            if(arrayA[0]>arrayB[0]){
                arrayC = Array.arrAdd(arrayC, arrayB[0]);
                arrayB = Array.arrRem(arrayB);
            }else{
                arrayC = Array.arrAdd(arrayC, arrayA[0]);
                arrayA = Array.arrRem(arrayA);
            }
        }
        while(arrayA.length != 0){
            arrayC = Array.arrAdd(arrayC, arrayA[0]);
            arrayA = Array.arrRem(arrayA);
        }
        while(arrayB.length != 0){
            arrayC = Array.arrAdd(arrayC, arrayB[0]);
            arrayB = Array.arrRem(arrayB);
        }

        return arrayC;
    }

    public static double[] mergeSort(double[] arrayA){
        if(arrayA.length == 1){
            return arrayA;
        }

        int a;
        int b;

        if(arrayA.length % 2 != 0){
            a = (arrayA.length+1)/2;
            b = (arrayA.length-1)/2;
        }else{
            a = (arrayA.length)/2;
            b = (arrayA.length)/2;
        }

            double[] array1 = new double[a];
            double[] array2 = new double[b];

        for(int i=0; i<array1.length; i++){
            array1[i] = arrayA[0];
            arrayA = Array.arrRem(arrayA);

        }
        for(int i=0; i<array2.length; i++){
            array2[i] = arrayA[0];
            arrayA = Array.arrRem(arrayA);
        }

        array1 = mergeSort(array1);
        array2 = mergeSort(array2);

        return merge(array1, array2);
    }

I am also including my "arrAdd" and "arrRem" methods that are used to add an element at the end of an array and to remove one from index 0.

    public static double[] arrAdd(double[] arrayA, double value){
        double[] arrayB = new double[arrayA.length+1];
        System.arraycopy(arrayA, 0, arrayB, 0, arrayA.length);
        arrayB[arrayA.length] = value;
        return arrayB;
    }

    public static double[] arrRem(double[] arrayA){
        double[] arrayB = new double[arrayA.length-1];
        System.arraycopy(arrayA, 1, arrayB, 0, arrayA.length - 1);
        return arrayB;
    }


Solution

  • Because arrayRem is dreadfully inefficient.

    Suppose you have an array with 1 million elements. You want to go through these elements. You do so by looking at the first element, then removing the first element by copying all other elements one place closer to the front, and repeat this until the array is empty.

    So after consuming the first element, you are copying 999 999 elements.
    After consuming the second element, you are copying 999 998 elements.
    After consuming the third element, you are copying 999 997 elements.
    (999 996 lines omitted here)
    After consuming the 1 000 000th element, you are copying 0 elements.

    That is, to read through an array of length 1 million, you are copying a total of 1 000 000 * 500 000 = 500 000 000 000 elements. That takes a while.

    A correct implementation of mergesort would read through the array by incrementing an index, not by incrementally deleting the first element and copying all remaining elements every time.

    Similarly, an efficient way of adding elements works by creating an array big enough to hold all elements up front, and then just write each single element into its correct place by keeping track of the index the next element should be written to, rather than copying all existing elements for every element added.