I understand that quicksort is indeed unstable (incase of primitives)because of the way partitioning works/long-distance exchanges. I am trying to understand what happens if quicksort is used to sort complex objects with equal keys. Essentially why java Collections.sort
doesn't use Quicksort.
Here is a demo app that I created to aid my understanding. As per this app, objects with equal keys seem to retain their input order. I know that I have some understanding gaps here. I did search on the web but most examples are based on integer sorting.
Please help me understand quicksort stability issues.
DEMO
import java.util.*;
public class QuickSortStabilityDemo {
static class Node implements Comparable<Node> {
String name;
int rank;
public Node(String name, int rank) {
this.name = name;
this.rank = rank;
}
@Override
public int compareTo(Node o) {
int result = this.name.compareTo(o.name);
if(result == 0) {
return this.rank == o.rank ? 0 : this.rank < o.rank ? -1: 1;
}
else {
return result;
}
}
@Override
public String toString() {
return "{" + this.name + "," + this.rank + "," + this.hashCode() + "}" ;
}
}
//Fisher-Yates
public void shuffleArray(Node[] arr) {
Random random = new Random();
int n = arr.length;
for(int i=n-1; i>=0; i--) {
int j = random.nextInt(i+1);
Node temp = arr[i];
arr[i]= arr[j];
arr[j]=temp;
}
}
private void swap(Node[] arr, int i, int j) {
Node temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
public void sort(Node[] arr, int start, int end) {
if(start >= end) {
return;
}
Node pivot = arr[start];
int lt = start;
int gt = end;
for(int current=start+1;current <= gt; ) {
if(arr[current].compareTo(pivot) < 0) {
swap(arr,current,lt);
current++;
lt++;
}
else if(arr[current].compareTo(pivot) > 0) {
swap(arr,current,gt);
gt--;
}
else {
current++;
}
}
sort(arr,start,lt-1);
sort(arr,gt+1,end);
}
public static void main(String[] args) {
QuickSortStabilityDemo sort = new QuickSortStabilityDemo();
String[] cities = {"New York","Jersey City","Pittsburgh"};
List<Node> list = new ArrayList<>();
for(int i=0;i <3;i++) {
for(int j=1; j <=3; j++) {
list.add(new Node(cities[i],i));
}
}
Node[] arr = list.toArray(new Node[list.size()]);
System.out.println("Before sorting...");
System.out.println(Arrays.toString(arr));
sort.sort(arr,0,arr.length-1);
System.out.println("After sorting...");
System.out.println(Arrays.toString(arr));
}
}
I found the answer:
In the original demo I posted, the data was a bit contrived. The properties of objects within each set are same and is done purposefully. I did not shuffle the array; set the pivot to the start element of portion of array that is being sorted.
As I debugged my demo further, event though NY and JC objects retained their original order Pgh objects did change their original insertion order. Therefore I did see the instability of the algorithm.
I used hashcode of these elements to track their original insertion order.
Here is the output from a run:
[{New York,0,1163157884}
, {New York,0,1956725890}
, {New York,0,356573597}
, {Jersey City,1,1735600054}
, {Jersey City,1,21685669}
, {Jersey City,1,2133927002}
, {Pittsburgh,2,1836019240}
, {Pittsburgh,2,325040804}
, {Pittsburgh,2,1173230247}
]
After sorting
[{Jersey City,1,1735600054}
, {Jersey City,1,21685669}
, {Jersey City,1,2133927002}
, {New York,0,1163157884}
, {New York,0,1956725890}
, {New York,0,356573597}
, {Pittsburgh,2,325040804}
, {Pittsburgh,2,1173230247}
, {Pittsburgh,2,1836019240}
]
If I shuffle the input array, the instability of the algorithm is more apparent.
Here is the output from a run (with shuffled input):
Original order
[{New York,0,1163157884}
, {New York,0,1956725890}
, {New York,0,356573597}
, {Jersey City,1,1735600054}
, {Jersey City,1,21685669}
, {Jersey City,1,2133927002}
, {Pittsburgh,2,1836019240}
, {Pittsburgh,2,325040804}
, {Pittsburgh,2,1173230247}
]
After shuffling
[{New York,0,1163157884}
, {New York,0,1956725890}
, {Pittsburgh,2,325040804}
, {Jersey City,1,2133927002}
, {New York,0,356573597}
, {Jersey City,1,1735600054}
, {Pittsburgh,2,1836019240}
, {Pittsburgh,2,1173230247}
, {Jersey City,1,21685669}
]
After sorting
[{Jersey City,1,21685669}
, {Jersey City,1,2133927002}
, {Jersey City,1,1735600054}
, {New York,0,1956725890}
, {New York,0,356573597}
, {New York,0,1163157884}
, {Pittsburgh,2,1173230247}
, {Pittsburgh,2,1836019240}
, {Pittsburgh,2,325040804}
]
Please let me know if any suggestions on this answer.