
Rank multiple selected columns in a data frame and add the ranks as new columns


The example data looks like this:

library(dplyr)  # provides as_tibble() and %>%

players <- 1:10
mean_K <- round(runif(10, 1, 10), 2)  # random values, so yours will differ
mean_D <- round(runif(10, 0, 10), 2)
mean_A <- round(runif(10, 0, 10), 2)
d <- cbind(players, mean_K, mean_A, mean_D)
d <- as_tibble(d)
d

Output:

# A tibble: 10 x 4
   players mean_K mean_A mean_D
     <dbl>  <dbl>  <dbl>  <dbl>
 1       1   3.58   9.21   3.4 
 2       2   3.8    0.49   1.33
 3       3   2.47   2.29   1.72
 4       4   1.52   7.25   6.11
 5       5   8.73   3.98   8.08
 6       6   8.08   9.77   3.88
 7       7   3.97   4.52   9.93
 8       8   2.84   6.83   9.57
 9       9   3.87   0.35   5.32
10      10   2.82   3.8    3.09

I want to rank several selected columns at once. I tried using apply:

apply(d[,2:4], 2, rank)

Output:

      mean_K mean_A mean_D
 [1,]      5      9      4
 [2,]      6      2      1
 [3,]      2      3      2
 [4,]      1      8      7
 [5,]     10      5      8
 [6,]      9     10      5
 [7,]      8      6     10
 [8,]      4      7      9
 [9,]      7      1      6
[10,]      3      4      3

But then I got stuck: how can I add all the new rank columns to the data frame d? And is it possible to do this with dplyr pipes (%>%)?

My desired output looks like this:

   players mean_K mean_K_ranking mean_D mean_D_ranking mean_A mean_A_ranking
1        1   3.58              5   3.40              4   9.21              9
2        2   3.80              6   1.33              1   0.49              2
3        3   2.47              2   1.72              2   2.29              3
4        4   1.52              1   6.11              7   7.25              8
5        5   8.73             10   8.08              8   3.98              5
6        6   8.08              9   3.88              5   9.77             10
7        7   3.97              8   9.93             10   4.52              6
8        8   2.84              4   9.57              9   6.83              7
9        9   3.87              7   5.32              6   0.35              1
10      10   2.82              3   3.09              3   3.80              4

Solution

  • We can assign the ranks back to d as new columns. Here, lapply loops over the columns: if the columns differ in type, lapply preserves each column's type, whereas apply first converts the data to a matrix, which forces a single type on all columns.

    lst1 <- lapply(d[2:4], rank)
    d[paste0(names(lst1), "_ranking")] <- lst1
    

    Or, using the tidyverse:

    library(dplyr)
    d <- d %>%
         mutate(across(starts_with('mean'), rank, .names = "{.col}_ranking"))
    

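    Or use imap_dfc from purrr to place each ranking column next to its source column, as in the expected output. Run this on the original d, before any ranking columns have been added, because starts_with('mean_') would otherwise also match them: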

    library(purrr)
    library(stringr)
    d %>% 
        select(starts_with('mean_')) %>% 
        imap_dfc(~ tibble(!! .y := .x, 
            !!str_c(.y, '_ranking') := rank(.x))) %>% 
        bind_cols(d %>% 
                   select(players), .)
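
The point above about lapply preserving column types while apply converts to a matrix can be illustrated with a small sketch. The data frame `df` and its columns `id` and `score` are hypothetical, made up only for this illustration. In the question's data all selected columns are numeric, so apply happens to work there; the difference only shows up once a non-numeric column is included.

```r
# Hypothetical mixed-type data frame, used only for illustration
df <- data.frame(id = c("a", "b", "c"),
                 score = c(2.5, 1.0, 3.2),
                 stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first, so every
# column is coerced to a single common type (character here)
apply_classes <- apply(df, 2, class)

# sapply()/lapply() visit each column as it is, so types are kept
lapply_classes <- sapply(df, class)

print(apply_classes)
print(lapply_classes)
```

With apply, ranking `score` would operate on character strings rather than numbers, which is why the lapply version is the safer default when the selected columns may differ in type.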