When running Scalding MapReduce operations, I need to compare tuples using my own comparison function on the tuple fields.
Questions:
Thanks!
You can create a virtual field (e.g. by using com.twitter.scalding.RichPipe#map), sort by this field, and then discard it. Here is an example based on the Scalding documentation:
val users = Csv(file_source, separator = ",", fields = Schema)
  .read
  .map('age -> 'ageInt) { x: Int => x }  // derive a numeric virtual field from 'age
  .groupAll { _.sortBy('ageInt) }        // will sort age as a number
  .discard('ageInt)                      // drop the virtual field once sorting is done
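
To see the same idea without a Hadoop job, here is a minimal sketch using plain Scala collections rather than Scalding. It only illustrates the pattern (derive a sort key, sort on it, drop it); the `User` case class, the `sortKey` function, and the sample data are hypothetical and not part of the Scalding API.

```scala
// Hypothetical record standing in for one tuple; age is kept as a string on purpose.
final case class User(name: String, age: String)

object SortByDerivedKey {
  // Hypothetical custom comparison rule: compare ages numerically, not lexically.
  def sortKey(u: User): Int = u.age.trim.toInt

  def sortUsers(users: Seq[User]): Seq[User] =
    users
      .map(u => (sortKey(u), u)) // add the "virtual field" next to each record
      .sortBy(_._1)              // sort on the derived key
      .map(_._2)                 // take the virtual field away again

  def main(args: Array[String]): Unit = {
    val users = Seq(User("ann", "30"), User("bob", "4"), User("cy", "100"))
    // Prints bob, ann, cy (numeric order); a plain string sort would give cy, ann, bob.
    sortUsers(users).foreach(u => println(u.name))
  }
}
```

Whatever comparison you need goes into the function that computes the key, so the sort itself stays a plain sort on one field.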