scala sorting collections ranking

Rank value strategies in Scala collections


I am looking for the best way to compute different types of ranking for a Scala collection such as IndexedSeq, similar to the tie-handling strategies of R's rank function: http://stat.ethz.ch/R-manual/R-devel/library/base/html/rank.html . I could not find this in the current API, but perhaps I missed it.

val tabToRank = IndexedSeq(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

For example, with the "first" rank strategy (among tied values, the first occurrence wins), I would expect:

tabToRank.rank("first")
// should return (4, 1, 6, 2, 7, 11, 3, 10, 8, 5, 9)

Here is my use case: given a list of cities with their populations (a data vector like tabToRank) at the final state of a simulation, I need to a) rank the cities and b) sort them by rank, in order to plot "rank of city by population", i.e. the well-known rank-size distribution:

a rank size distribution


Solution

  • For the city data, you want

    citipop.sortBy(x => -x).zipWithIndex.map(_.swap)
    

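    to first sort the populations largest first (the default order is smallest first, so we sort by the negated value), then number them (zipWithIndex starts at 0, so add 1 if you want ranks starting at 1), and finally swap each pair so that the number comes first and the population second.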

    Scala doesn't have a built-in statistical library, however. In general, you'll have to know what you want to do and do it yourself or use a Java library (e.g. Apache Commons Math).
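If you do write it yourself, the following is a minimal sketch of two tie-handling strategies, under some assumptions: the object `Ranking` and the functions `rankFirst` and `rankMin` are hypothetical names, not part of the Scala standard library, and they are only meant to mimic R's `ties.method = "first"` and `"min"`. The sketch relies on `sortBy` being a stable sort, so tied values keep their original order.

```scala
object Ranking {
  // Hypothetical helper: 1-based ranks, ties broken by first occurrence
  // (intended to behave like R's rank(x, ties.method = "first")).
  def rankFirst[A](xs: IndexedSeq[A])(implicit ord: Ordering[A]): IndexedSeq[Int] = {
    // order(k) is the index in xs of the k-th smallest element;
    // sortBy is stable, so equal values stay in their original order.
    val order = xs.indices.sortBy(i => xs(i))
    val ranks = new Array[Int](xs.length)
    order.zipWithIndex.foreach { case (idx, k) => ranks(idx) = k + 1 }
    ranks.toIndexedSeq
  }

  // Hypothetical helper: tied values all share the smallest rank of their group
  // (intended to behave like R's ties.method = "min").
  // Simple but quadratic because of indexOf; fine for small inputs.
  def rankMin[A](xs: IndexedSeq[A])(implicit ord: Ordering[A]): IndexedSeq[Int] = {
    val sorted = xs.sorted
    xs.map(x => sorted.indexOf(x) + 1)
  }
}

object RankingDemo extends App {
  val tabToRank = IndexedSeq(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

  println(Ranking.rankFirst(tabToRank).mkString(","))
  // prints 4,1,6,2,7,11,3,10,8,5,9

  println(Ranking.rankMin(tabToRank).mkString(","))
  // prints 4,1,6,1,7,11,3,10,7,4,7

  // For a rank-size plot, the largest value should get rank 1:
  // pass a reversed ordering explicitly.
  println(Ranking.rankFirst(IndexedSeq(10, 30, 20))(Ordering[Int].reverse).mkString(","))
  // prints 3,1,2
}
```

Passing the `Ordering` explicitly keeps the "largest first" choice at the call site, so the same function can serve both ascending ranks and the descending ranks needed for the city data.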