I am a little confused. I've been trying to test quicksort, and everything seems to work fine, except that when the array contains many duplicate elements I get an unexpected result. I use Random's nextInt method to fill the array. When I decrease the data range (which consequently puts more duplicates in the array), quicksort runs faster.
1 million elements (range 0 - 1 million): 155 ms
1 million elements (range 0 - 2): 118 ms
30 million elements (range 0 - 1 million): 3085 ms
30 million elements (range 0 - 2): 1021 ms
I tried this many times, so it doesn't seem to be due to the randomization, and the difference gets even more pronounced with larger data. I couldn't find any explanation for why this might occur.
Here is my code:
public void qsort2median(int[] array, int lowest, int highest)
{
    int pivot = array[(lowest + highest) / 2];  // middle element as pivot
    int left = lowest;
    int right = highest;
    // Hoare-style partitioning: scan inward from both ends.
    while (left <= right){
        // Advance until an element >= pivot (equal elements stop the scan too).
        while (array[left] < pivot){
            left++;
        }
        // Retreat until an element <= pivot.
        while (array[right] > pivot){
            right--;
        }
        // Swap the out-of-place pair and keep scanning.
        if (left <= right){
            int temp = array[left];
            array[left] = array[right];
            array[right] = temp;
            left++;
            right--;
        }
    }
    // Recurse on the two partitions.
    if (lowest < right) {
        qsort2median(array, lowest, right);
    }
    if (left < highest) {
        qsort2median(array, left, highest);
    }
}
and here is how I run it:
int [] arraytest = new int [1000000];
int [] arraytest1 = new int [1000000];
Quicksort qsorttest = new Quicksort();
qsorttest.randomArrayGenerator(arraytest,2);
System.arraycopy(arraytest,0,arraytest1,0,arraytest.length);
int endpoint = arraytest.length-1;
//RUN FIRST ARRAY
long startTime = System.currentTimeMillis();
qsorttest.qsort2median(arraytest,0,endpoint);
long endTime = System.currentTimeMillis();
long duration = (endTime - startTime);
System.out.println("Quicksort with Median Element pivot ran in: " + duration);
and here is my randomArrayGenerator method:
void randomArrayGenerator(int[] array, int n){
    // Requires java.util.Random. Fills the array with uniform values in [0, n).
    Random generator = new Random();
    System.out.println();
    System.out.println(array.length);
    for (int i = 0; i < array.length; i++){
        array[i] = generator.nextInt(n);
    }
}
Thanks for any input.
As of Monday, I am still trying to figure this out.
I guess I am allowed to answer my own question.
Upon further research, I discovered that duplicates only become a problem when the quicksort algorithm uses Lomuto's partitioning. My code, on the other hand, uses Hoare's partitioning.
I tested both of them and got the expected result: Lomuto's partitioning gets slower as duplicates increase, while Hoare's partitioning gets faster. The reason is that Hoare's inner scans stop on elements equal to the pivot and swap them to the other side, so a run of duplicates gets split roughly in half, whereas Lomuto's scheme piles everything equal to the pivot onto one side, so an all-duplicate range degenerates into the worst case. A sketch of Lomuto's scheme is below for comparison.
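Here is a minimal Lomuto sketch in Java, just to illustrate the contrast; the names lomutoPartition and qsortLomuto are my own for this example, not part of my code above:

int lomutoPartition(int[] array, int lowest, int highest){
    int pivot = array[highest];   // Lomuto conventionally pivots on the last element
    int wall = lowest - 1;        // boundary of the "<= pivot" region
    for (int i = lowest; i < highest; i++){
        if (array[i] <= pivot){   // duplicates of the pivot also land left of the wall
            wall++;
            int temp = array[wall];
            array[wall] = array[i];
            array[i] = temp;
        }
    }
    // Move the pivot just after the "<= pivot" region and return its index.
    int temp = array[wall + 1];
    array[wall + 1] = array[highest];
    array[highest] = temp;
    return wall + 1;
}

void qsortLomuto(int[] array, int lowest, int highest){
    if (lowest < highest){
        int p = lomutoPartition(array, lowest, highest);
        qsortLomuto(array, lowest, p - 1);
        qsortLomuto(array, p + 1, highest);
    }
}

On an all-duplicate range every element satisfies array[i] <= pivot, so the partition point lands at the far end and each call only shrinks the range by one element, which is exactly the quadratic slowdown I was seeing with Lomuto.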