Tags: java, arrays, performance, quicksort

Quicksort gets faster with duplicate keys (without three-way partitioning). What is going on?


I am a little confused. I've been trying to test quicksort and everything seems to be working fine, except that when the array contains many duplicate elements, I get an unexpected result.

  • The code actually sorts the array.
  • The code actually sorts the array within a time frame that is expected of quicksort.
  • When the array is already sorted or reverse sorted, I get the expected inefficiency (this, of course, is when the pivot is set to the first element).
  • I fill the array with the nextInt method of java.util.Random.

When I decrease the data range (which consequently puts more duplicates into the array), quicksort runs faster.

  • 1 million elements (range 0 - 1 million): 155 ms
  • 1 million elements (range 0 - 2): 118 ms
  • 30 million elements (range 0 - 1 million): 3085 ms
  • 30 million elements (range 0 - 2): 1021 ms

I have tried this many times, so it doesn't seem to be due to the randomization. The difference gets even more pronounced with larger data.

I couldn't find any explanation for why this might occur.

Here is my code:

public void qsort2median(int array[], int lowest, int highest) {
    // Hoare-style partitioning around the middle element as pivot.
    int pivot = array[(lowest + highest) / 2];
    int left = lowest;
    int right = highest;

    while (left <= right) {
        // Scan from the left for an element >= pivot.
        while (array[left] < pivot) {
            left++;
        }
        // Scan from the right for an element <= pivot.
        while (array[right] > pivot) {
            right--;
        }
        // Swap the two out-of-place elements and advance both pointers.
        if (left <= right) {
            int temp = array[left];
            array[left] = array[right];
            array[right] = temp;
            left++;
            right--;
        }
    }
    // Recurse into the two partitions.
    if (lowest < right) {
        qsort2median(array, lowest, right);
    }
    if (left < highest) {
        qsort2median(array, left, highest);
    }
}

and here is how I run it:

    int[] arraytest = new int[1000000];
    int[] arraytest1 = new int[1000000];

    Quicksort qsorttest = new Quicksort();

    qsorttest.randomArrayGenerator(arraytest, 2);

    System.arraycopy(arraytest, 0, arraytest1, 0, arraytest.length);
    int endpoint = arraytest.length - 1;

    // RUN FIRST ARRAY
    long startTime = System.currentTimeMillis();
    qsorttest.qsort2median(arraytest, 0, endpoint);
    long endTime = System.currentTimeMillis();

    long duration = endTime - startTime;
    System.out.println("Quicksort with Median Element pivot ran in: " + duration);

and here is my randomArrayGenerator method:

void randomArrayGenerator(int[] array, int n) {

    Random generator = new Random();  // requires java.util.Random

    System.out.println();
    System.out.println(array.length);

    for (int i = 0; i < array.length; i++) {
        array[i] = generator.nextInt(n);  // uniform values in [0, n)
    }
}

Thanks for any input.

As of Monday, I am still trying to figure this out.


Solution

  • I guess I am allowed to answer my own question.

    Upon further research, I discovered that duplicates only become a problem when quicksort uses Lomuto's partitioning scheme. My code, on the other hand, uses Hoare's partitioning scheme.

    I tested both of them and got the expected result.

    So Lomuto's partitioning gets slower as the number of duplicates increases, while Hoare's partitioning gets faster.
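
    For comparison, here is a minimal sketch of a Lomuto-style quicksort (the method name qsortLomuto and its exact layout are my own, not taken from the code above). Because the strict '<' test only moves elements smaller than the pivot, every key equal to the pivot stays in the right-hand partition, so an array full of duplicates produces one empty partition and one of size n-1 on every call:

        // Sketch of Lomuto partitioning, for comparison only.
        public void qsortLomuto(int[] array, int lowest, int highest) {
            if (lowest >= highest) {
                return;
            }
            int pivot = array[highest];              // last element as pivot
            int i = lowest - 1;
            for (int j = lowest; j < highest; j++) {
                if (array[j] < pivot) {              // strict '<': equal keys never move left
                    i++;
                    int temp = array[i];
                    array[i] = array[j];
                    array[j] = temp;
                }
            }
            int temp = array[i + 1];                 // put the pivot into its final position
            array[i + 1] = array[highest];
            array[highest] = temp;
            qsortLomuto(array, lowest, i);           // elements < pivot
            qsortLomuto(array, i + 2, highest);      // elements >= pivot
        }

    With nothing but duplicates, that degrades toward O(n^2), and the recursion on the large side gets roughly n levels deep, so on a million equal keys you are likely to hit a StackOverflowError before you even see the slow timing. The qsort2median method above does the opposite: both scan loops stop on keys equal to the pivot, so equal keys are swapped onto both sides and the split point stays close to the middle. With only the values 0 and 1 in the array, almost every partition is a near-perfect halving, which is consistent with the range 0 - 2 runs being the fastest in the timings above.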