Search code examples
javaquicksort

Quicksort Stability Demo


I understand that quicksort is indeed unstable (incase of primitives)because of the way partitioning works/long-distance exchanges. I am trying to understand what happens if quicksort is used to sort complex objects with equal keys. Essentially why java Collections.sort doesn't use Quicksort.

Here is a demo app that I created to aid my understanding. As per this app, objects with equal keys seem to retain their input order. I know that I have some understanding gaps here. I did search on the web but most examples are based on integer sorting.

Please help me understand quicksort stability issues.

DEMO

import java.util.*;

public class QuickSortStabilityDemo {

    static class Node implements Comparable<Node> {
        String name;
        int rank;

        public Node(String name, int rank) {
            this.name =  name;
            this.rank = rank;
        }

        @Override
        public int compareTo(Node o) {
            int result = this.name.compareTo(o.name);
            if(result == 0) {
                return this.rank == o.rank ? 0 : this.rank < o.rank ? -1: 1;
            }
            else {
                return result;
            }
        }

        @Override
        public String toString() {
            return "{" + this.name + "," + this.rank + "," + this.hashCode() + "}" ;
        }
    }

    //Fisher-Yates 
        public void shuffleArray(Node[] arr) {
            Random random = new Random();
            int n = arr.length;

            for(int i=n-1; i>=0; i--) {
                int j = random.nextInt(i+1);
                Node temp = arr[i];
                arr[i]= arr[j];
                arr[j]=temp;
            }

        }

        private void swap(Node[] arr, int i, int j) {
            Node temp = arr[i];
            arr[i] = arr[j];
            arr[j] = temp;
        }

        public void sort(Node[] arr, int start, int end) {  

            if(start >= end) {
                return;
            }



            Node pivot = arr[start];

            int lt = start;
            int gt = end;


            for(int current=start+1;current <= gt; ) {

                if(arr[current].compareTo(pivot) < 0) {
                    swap(arr,current,lt);  
                    current++; 
                    lt++; 
                }
                else if(arr[current].compareTo(pivot) > 0) {
                    swap(arr,current,gt); 
                    gt--; 
                }
                else {
                    current++;
                }

            }
            sort(arr,start,lt-1);
            sort(arr,gt+1,end);


        }

        public static void main(String[] args) {
            QuickSortStabilityDemo sort = new QuickSortStabilityDemo();
            String[] cities = {"New York","Jersey City","Pittsburgh"};

            List<Node> list = new ArrayList<>();

            for(int i=0;i <3;i++) {
                for(int j=1; j <=3; j++) {
                    list.add(new Node(cities[i],i));
                }
            }

            Node[] arr = list.toArray(new Node[list.size()]);

            System.out.println("Before sorting...");

            System.out.println(Arrays.toString(arr));

            sort.sort(arr,0,arr.length-1);

            System.out.println("After sorting...");


            System.out.println(Arrays.toString(arr));
        }

}

Solution

  • I found the answer:

    In the original demo I posted, the data was a bit contrived. The properties of objects within each set are same and is done purposefully. I did not shuffle the array; set the pivot to the start element of portion of array that is being sorted.

    As I debugged my demo further, event though NY and JC objects retained their original order Pgh objects did change their original insertion order. Therefore I did see the instability of the algorithm.

    I used hashcode of these elements to track their original insertion order.

    Here is the output from a run:

    [{New York,0,1163157884}
    , {New York,0,1956725890}
    , {New York,0,356573597}
    , {Jersey City,1,1735600054}
    , {Jersey City,1,21685669}
    , {Jersey City,1,2133927002}
    , {Pittsburgh,2,1836019240}
    , {Pittsburgh,2,325040804}
    , {Pittsburgh,2,1173230247}
    ]
    After sorting
    [{Jersey City,1,1735600054}
    , {Jersey City,1,21685669}
    , {Jersey City,1,2133927002}
    , {New York,0,1163157884}
    , {New York,0,1956725890}
    , {New York,0,356573597}
    , {Pittsburgh,2,325040804}
    , {Pittsburgh,2,1173230247}
    , {Pittsburgh,2,1836019240}
    ]
    

    If I shuffle the input array, the instability of the algorithm is more apparent.

    Here is the output from a run (with shuffled input):

    Original order
    [{New York,0,1163157884}
    , {New York,0,1956725890}
    , {New York,0,356573597}
    , {Jersey City,1,1735600054}
    , {Jersey City,1,21685669}
    , {Jersey City,1,2133927002}
    , {Pittsburgh,2,1836019240}
    , {Pittsburgh,2,325040804}
    , {Pittsburgh,2,1173230247}
    ]
    After shuffling
    [{New York,0,1163157884}
    , {New York,0,1956725890}
    , {Pittsburgh,2,325040804}
    , {Jersey City,1,2133927002}
    , {New York,0,356573597}
    , {Jersey City,1,1735600054}
    , {Pittsburgh,2,1836019240}
    , {Pittsburgh,2,1173230247}
    , {Jersey City,1,21685669}
    ]
    After sorting
    [{Jersey City,1,21685669}
    , {Jersey City,1,2133927002}
    , {Jersey City,1,1735600054}
    , {New York,0,1956725890}
    , {New York,0,356573597}
    , {New York,0,1163157884}
    , {Pittsburgh,2,1173230247}
    , {Pittsburgh,2,1836019240}
    , {Pittsburgh,2,325040804}
    ]
    

    Please let me know if any suggestions on this answer.