
Adding multiple ranking columns in a dataframe in R


I have the following df:

id city uf home money work
34  LA  RJ  10     2     2   
33  BA  TY  7      3     65
32  NY  BN  4      5     4
12  SP  SD  3      9     7
14  FR  DE  1      8     9
17  BL  DE  5      10    8

DESIRED DF:

id city uf home  rank_home  money    rank_money   work   rank_work
34  LA  RJ  10    1          2         6           2       6
33  BA  TY  7     2          3         5           65      1
32  NY  BN  4     4          5         4           4       5
12  SP  SD  3     5          9         2           7       4
14  FR  DE  1     6          8         3           9       2
17  BL  DE  5     3          10        1           8       3

I know this is possible: dat$rank_home <- rank(dat$home)

But I want cleaner code that handles multiple columns at once!

Thank you!!


Solution

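  • We can loop across the columns 'home' to 'work' with across(), apply rank() to the negated values (so the largest value gets rank 1), and name the new columns by adding a prefix through .names. mutate() appends the new columns at the end of the data frame, so a select() may be needed afterwards to keep the desired column order.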

    library(dplyr)
    df1 <- df %>% 
       mutate(across(home:work, ~ rank(-.), .names = "rank_{.col}"))
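
    As a sketch of the reordering step mentioned above, the rank columns can be interleaved with their source columns in a select(). The helper vectors cols and ordered are names introduced here for illustration and are not part of the original answer; the data is copied from the question.

    ```r
    library(dplyr)

    # Data from the question
    df <- data.frame(
      id    = c(34L, 33L, 32L, 12L, 14L, 17L),
      city  = c("LA", "BA", "NY", "SP", "FR", "BL"),
      uf    = c("RJ", "TY", "BN", "SD", "DE", "DE"),
      home  = c(10L, 7L, 4L, 3L, 1L, 5L),
      money = c(2L, 3L, 5L, 9L, 8L, 10L),
      work  = c(2L, 65L, 4L, 7L, 9L, 8L)
    )

    cols <- c("home", "money", "work")  # columns to rank (illustrative name)

    # rbind() stacks the two name vectors as rows; c() then reads the matrix
    # column by column, giving home, rank_home, money, rank_money, ...
    ordered <- c(rbind(cols, paste0("rank_", cols)))

    df1 <- df %>%
      mutate(across(all_of(cols), ~ rank(-.x), .names = "rank_{.col}")) %>%
      select(id, city, uf, all_of(ordered))

    names(df1)
    ```

    Building the column order from cols keeps the select() in sync if the set of ranked columns changes.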
    

    Alternatively, use a for loop, which makes it easier to place each new column at a particular position by specifying either .after or .before in mutate(). Note that the compound assignment operator (%<>% from magrittr) pipes df into mutate() and assigns the result back to df in place.

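    library(dplyr)     # mutate(), all_of()
    library(magrittr)  # %<>%
    library(stringr)   # str_c()
    # add rank_<nm> directly after each of columns 4 to 6 (home, money, work)
    for(nm in names(df)[4:6]) df %<>%
         mutate(!!str_c("rank_", nm) := rank(-.data[[nm]]), .after = all_of(nm))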
    

    -output

    df
      id city uf home rank_home money rank_money work rank_work
    1 34   LA RJ   10         1     2          6    2         6
    2 33   BA TY    7         2     3          5   65         1
    3 32   NY BN    4         4     5          4    4         5
    4 12   SP SD    3         5     9          2    7         4
    5 14   FR DE    1         6     8          3    9         2
    6 17   BL DE    5         3    10          1    8         3
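
    For comparison, here is a minimal base R sketch that avoids dplyr. It is not part of the original answer, and the vector name cols is introduced only for illustration. It appends the rank columns at the end of the data frame rather than next to their source columns.

    ```r
    # Data from the question
    df <- data.frame(
      id    = c(34L, 33L, 32L, 12L, 14L, 17L),
      city  = c("LA", "BA", "NY", "SP", "FR", "BL"),
      uf    = c("RJ", "TY", "BN", "SD", "DE", "DE"),
      home  = c(10L, 7L, 4L, 3L, 1L, 5L),
      money = c(2L, 3L, 5L, 9L, 8L, 10L),
      work  = c(2L, 65L, 4L, 7L, 9L, 8L)
    )

    cols <- c("home", "money", "work")  # columns to rank (illustrative name)

    # lapply() ranks each column in descending order; assigning the resulting
    # list to new column names creates all rank_* columns in one step
    df[paste0("rank_", cols)] <- lapply(df[cols], function(x) rank(-x))

    df
    ```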
    

    NOTE: If a column has ties, rank() uses ties.method = "average" by default, which can produce fractional ranks. A different ties.method (e.g. "min" or "first") can be passed to rank() to change how ties are handled.
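
    To illustrate the effect of ties.method, here is a small sketch on a made-up vector with one tie (the vector x is not from the question's data, which has no ties):

    ```r
    x <- c(10, 7, 7, 3)  # hypothetical values; the two 7s are tied

    r_avg   <- rank(-x)                          # default "average": 1.0 2.5 2.5 4.0
    r_min   <- rank(-x, ties.method = "min")     # tied values share the lowest rank: 1 2 2 4
    r_first <- rank(-x, ties.method = "first")   # ties broken by position: 1 2 3 4

    print(r_avg)
    print(r_min)
    print(r_first)
    ```

    In the across() call, the same argument can be passed inside the lambda, e.g. ~ rank(-., ties.method = "min").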

    data

    df <- structure(list(id = c(34L, 33L, 32L, 12L, 14L, 17L), city = c("LA", 
    "BA", "NY", "SP", "FR", "BL"), uf = c("RJ", "TY", "BN", "SD", 
    "DE", "DE"), home = c(10L, 7L, 4L, 3L, 1L, 5L), money = c(2L, 
    3L, 5L, 9L, 8L, 10L), work = c(2L, 65L, 4L, 7L, 9L, 8L)), 
    class = "data.frame", row.names = c(NA, 
    -6L))