Search code examples
javaalgorithmperformancesortingshellsort

Insertion sort much faster than shell sort


I'm reading chapter about sorting in Sedgewick's "Algorithms". Along the way I wrote 3 basic algorithms for sorting: selection, insertion and shell sort. The book says, that, despite the fact that all three have quadratic worst case complexity, shell sort should be much faster than insertion sort on random data. In the book they get 600x performance boost.

But I get following multipliers (almost do not change with increasing of array size) on my laptop:

  • selection: 5.5x
  • insertion: 1x
  • shell: 1.8x !

The question that bothers me is - why shell sort is almost two times slower, than insertion sort?!

I guess, something is wrong with my shellsort implementation. But I almost copied it from the book:

class ShellSort extends Sort {
    //precalculate sequence: 1, 4, 13, 40, 121, 364, 1093, ...
    //(3^20 - 1)/2 is enough for any array I sort
    private static final int[] SEQUENCE = new int[20];
    static {
        for(int i = 1; i <= SEQUENCE.length; i++)
            SEQUENCE[i - 1] = (int)(Math.pow(3, i) - 1) / 2;
    }

    public void sort(int[] a) {
        int length = a.length;

        int seqLen = SEQUENCE.length;
        int nth;
        int j;

        for(int seqi = seqLen - 1; seqi >= 0; seqi--) {
            if(SEQUENCE[seqi] < length / 3) {
                nth = SEQUENCE[seqi];
                for(int n = 0; n < length; n+=nth) {
                    j = n;
                    while(j > 0 && a[j] < a[j - nth]) {
                        exch(a, j, j-nth);
                        j -= nth;
                    }
                }
            }
        }
    }
}

Rest of the code for those, who would like to run test on their machine (doubling array size test with JVM heat up has no significant effect on the results, so this simple test is good enough for N > ~ 200 000).

main:

int N = 500_000;
Random rand = new Random();
int[] a = new int[N];
for(int i = 0; i < N; i++)
    a[i] = rand.nextInt();

//insertion sort
int[] aCopy = Arrays.copyOf(a, a.length);
long start = System.nanoTime();
new InsertionSort().sort(aCopy);
System.out.println("insert:\t" + (System.nanoTime() - start));

//shell sort
aCopy = Arrays.copyOf(a, a.length);
start = System.nanoTime();
new ShellSort().sort(aCopy);
System.out.println("shell:\t" + (System.nanoTime() - start));

InsertionSort and Sort classes:

class InsertionSort extends Sort {
    public void sort(int[] a) {
        int length = a.length;
        int j;
        int x;
        for(int i = 1; i < length; i++) {
            j = i;
            x = a[i];
            while(j > 0 && x < a[j-1]) {
                a[j] = a[--j];
            }
            a[j] = x;
        }
    }
}
abstract class Sort {
    abstract public void sort(int[] a);

    protected static final void exch(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}

Solution

  • Your implementation is broken and outputs the sorted array only due to the fact that the last step is 1 and your two internal cycles perform the basic insertion sort when the step is 1. When the step is greater then 1, the two internal cycles in your implementation do anything but step-sort the array, so what you implementation does is it shuffles the array in all iterations of the outer cycle and then insertion-sorts it in the last iteration of the outer cycle. Of course it will take longer then just insertion-sort it once.

    Reusing your sequence the proper shell sort implementation should look like this:

    public void sort( int[] a ) {
        int length = a.length;
    
        int stepIndex = 0;
        while ( stepIndex < SEQUENCE.length - 1 && SEQUENCE[ stepIndex ] < length / 3 ) {
            stepIndex++;
        }
    
        while ( stepIndex >= 0 ) {
            int step = SEQUENCE[ stepIndex-- ];
            for ( int i = step; i < length; i++ ) { // DIFF: i++ instead of i+=step
                for ( int j = i; j >= step && a[ j ] < a[ j - step ]; j -= step ) {
                    exch( a, j, j - step );
                }
            }
        }
    }
    

    Two main differences between this implementation and yours:

    • proper initial indexes for two internal cycles
    • proper index increment for middle cycle (+1 instead of +step in your code)

    Also, check the http://algs4.cs.princeton.edu/21elementary/Shell.java.html for a good implementation and good step sequence.