Search code examples
arraysscalasortingquicksort

Sort an array[(int, int)] in order to a list in scala


I am currently working on Spark framework using Scala. I have an array C: Array[(int, int)] to which i want to sort according to order of a map A:Map[int,int]. The code is given below:

line 1:import scala.util.Sorting
line 2: object Test {
line 3:     def main(args: Array[String]){
line 4:     val A = Map(5 -> 41, 1 -> 43, 2 -> 41, 3 -> 59,4 -> 51 )
line 5:     val B= A.toList.sortBy(x => x._2)
line 6:     B.foreach(println)
line 7:     val C=Array((1,15),(2,9),(3,6),(4,3),(5,4))
line 8:     Sorting.quickSort(C)(Ordering[(Int)].on(k => (A.get(k._1).get)))
line 9:     for(j <- 0 to C.length-1){
line 10:        println(C(j))
line 11:       }
line 12:    }
line 13: }

When I sort the A according the value part (line 5) and print, i get the output (line 6) like this:

(5,41)
(2,41)
(1,43)
(4,51)
(3,59)

But when i sort C according the key part of A (line 8), i get the following output (line 9-11)

(2,9)
(5,4)
(1,15)
(4,3)
(3,6)

As both (5,41) and (2,41) have same value and, in B (5,41) appears before (2,41). But after sorting C according to key part of A, why 2 is appearing before 5. Why is the result not uniform in both B and C?

Please suggest me the changes in the code to get a uniform result in both B and C. I want the C as: (5,4) (2,9) (1,15) (4,3) (3,6).


Solution

  • Map is by-design unordered hence sorting by value alone may result in nondeterministic order. If your business logic allows changing of the sorting criteria, consider making it consistent by ordering by value + key on both the Map and Array:

    import scala.util.Sorting
    
    val A = Map(5 -> 41, 1 -> 43, 2 -> 41, 3 -> 59, 4 -> 51 )
    val B = A.toList.sortBy(x => (x._2, x._1))
    
    B
    // res1: List[(Int, Int)] = List((2,41), (5,41), (1,43), (4,51), (3,59))
    
    val C = Array((1,15),(2,9),(3,6),(4,3),(5,4))
    
    Sorting.quickSort(C)(Ordering[(Int, Int)].on( k => (A.get(k._1).get, k._1) ))
    
    C
    // res23: Array[(Int, Int)] = Array((2,9), (5,4), (1,15), (4,3), (3,6))
    

    As a side note, to avoid crash due to key not found, consider using gerOrElse with a default (e.g. Int.MaxValue, etc):

    Sorting.quickSort(C)(Ordering[(Int, Int)].on( k => (A.getOrElse(k._1, Int.MaxValue), k._1) ))