apache-spark rdd

How to reverse ordering for RDD.takeOrdered()?


What is the syntax to reverse the ordering for the takeOrdered() method of an RDD in Spark?

For bonus points, what is the syntax for defining a custom ordering for an RDD in Spark?


Solution

  • Reverse Order

    val seq = Seq(3, 9, 2, 3, 5, 4)
    val rdd = sc.parallelize(seq, 2)
    // Passing the reversed Int ordering makes takeOrdered return the largest elements first.
    rdd.takeOrdered(2)(Ordering[Int].reverse)
    

    The result is Array(9, 5): the two largest elements, in descending order.
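
    A related shortcut, sketched here on the assumption that Spark is on the classpath: RDD.top(n) returns the n largest elements under the implicit ordering, so for this data it should match takeOrdered with a reversed ordering. The local master and the app name are illustrative choices, not something from the question; getOrCreate reuses an existing context (for example the one in spark-shell) if there is one.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    // Assumption: local mode with 2 threads; the app name is arbitrary.
    val sc = SparkContext.getOrCreate(
      new SparkConf().setMaster("local[2]").setAppName("top-sketch"))

    val nums = sc.parallelize(Seq(3, 9, 2, 3, 5, 4), 2)

    // top(n): the n largest elements under Ordering[Int], in descending order.
    val largest = nums.top(2)

    // The same request phrased through takeOrdered with the ordering reversed.
    val viaTakeOrdered = nums.takeOrdered(2)(Ordering[Int].reverse)
    ```

    Both values hold the elements 9 and 5, in that order. top is convenient when the natural ordering is what you want; takeOrdered with an explicit ordering is more general.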

  • Custom Order

    We will order people by age, oldest first.

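    case class Person(name: String, age: Int)
    val people = Array(Person("bob", 30), Person("ann", 32), Person("carl", 19))
    val peopleRdd = sc.parallelize(people, 2)
    // on(...) turns the reversed Ordering[Int] into an Ordering[Person] keyed on age.
    peopleRdd.takeOrdered(1)(Ordering[Int].reverse.on[Person](_.age))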
    

    The result is Array(Person(ann,32)), the oldest person.
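
    Custom orderings can also be built with Ordering.by, which may read more naturally than on. The sketch below is one possible way to do it, not the only one: it redeclares Person so it runs on its own, adds a hypothetical fourth person ("dee") to create an age tie, and assumes a local master with an arbitrary app name. Ties are broken by name via a tuple key, with the age negated so that older people sort first.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    case class Person(name: String, age: Int)

    // Assumption: local mode with 2 threads; the app name is arbitrary.
    val sc = SparkContext.getOrCreate(
      new SparkConf().setMaster("local[2]").setAppName("custom-ordering-sketch"))

    // "dee" is a made-up record added to show tie-breaking.
    val peopleWithTie = sc.parallelize(Seq(
      Person("bob", 30), Person("ann", 32), Person("carl", 19), Person("dee", 32)), 2)

    // Oldest first: build an Ordering[Person] from the age key, then reverse it.
    val byAgeDesc: Ordering[Person] = Ordering.by[Person, Int](_.age).reverse

    // Oldest first, ties broken by name ascending. Negating the age makes the
    // default tuple ordering sort it descending while the name stays ascending.
    val byAgeDescThenName: Ordering[Person] =
      Ordering.by[Person, (Int, String)](p => (-p.age, p.name))

    val oldest = peopleWithTie.takeOrdered(1)(byAgeDesc)
    val oldestTwo = peopleWithTie.takeOrdered(2)(byAgeDescThenName)
    ```

    Here oldestTwo holds Person(ann,32) followed by Person(dee,32). With byAgeDesc alone, which of the two 32-year-olds comes back is not specified, so add a tie-breaker whenever the result has to be deterministic.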