sorting, collections, mergesort, stable-sort, timsort

A solid example (or some business use case) where Stable sort makes a significant difference


I want to know the scenarios where stable sorting makes a huge impact.

Earlier versions of Java used merge sort, which is a stable sort, for the Collections.sort API, while quicksort was used for Arrays.sort on primitive arrays. Current versions of Java use Timsort for sorting objects, which is again a stable sort. Nowadays many popular languages, such as Python, Java, and Scala, use Timsort. I want to know how much it matters in practice that Timsort is a stable sort. What is the strong motivation driving the use of stable sorting techniques?


Solution

  • With a stable sort, a data set can be sorted by one field at a time, from least significant to most significant. For example, some spreadsheet programs can sort by at most 3 fields at a time. Because spreadsheet sorts are stable, a 6-field sort is still possible: first sort by the 3 least significant fields, then by the 3 most significant fields. Preserving the original order can also be a desired side effect. Say a data set is sorted by name, and a copy of that data set is then sorted by birth date: elements with the same birth date will retain their original name ordering, without the need for a complex compare function.

    There's also a performance difference between quicksort and merge sort. Typically, merge sort does more moves but fewer compares. If the compare overhead is greater than the move overhead, merge sort is faster. For example, when sorting an array of pointers to objects (which is how I think Java implements an array of objects), merge sort can be faster, since what is being moved are the pointers and what is being compared are the objects.
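
The field-at-a-time technique from the first paragraph can be sketched in Java roughly as follows. This is only an illustration: the `Person` class, its fields, and the sample data are hypothetical and not part of the original answer. It relies on `List.sort`, which is documented to be stable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class StableSortDemo {
    // Hypothetical element type, used only for this illustration.
    static final class Person {
        final String name;
        final int birthYear;

        Person(String name, int birthYear) {
            this.name = name;
            this.birthYear = birthYear;
        }

        @Override
        public String toString() {
            return name + "(" + birthYear + ")";
        }
    }

    // Sort by the least significant key first, then by the most significant key.
    // The second pass keeps the name order among people with equal birth years
    // only because List.sort is stable.
    static List<Person> sortByYearThenName(List<Person> input) {
        List<Person> people = new ArrayList<>(input);
        people.sort(Comparator.comparing((Person p) -> p.name));          // pass 1: secondary key
        people.sort(Comparator.comparingInt((Person p) -> p.birthYear));  // pass 2: primary key
        return people;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("Dana", 1985),
            new Person("Alex", 1990),
            new Person("Chris", 1985),
            new Person("Blake", 1990));
        // prints [Chris(1985), Dana(1985), Alex(1990), Blake(1990)]
        System.out.println(sortByYearThenName(people));
    }
}
```

A single compound comparator, such as `Comparator.comparingInt(...)` followed by `thenComparing(...)`, gives the same result here. The two-pass form matters when the first sort happened earlier or elsewhere, for example when a user clicks column headers one after another. With an unstable sort, the second pass would be free to scramble the name order within each birth year.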