R: ranking numerical data in a data.frame


mydata <- data.frame(Train = c(14.2, 2.2, 11.9), Test = c(10, 11.2, 12))
rownames(mydata) <- c("Method1", "Method2", "Method3")
> mydata
        Train Test
Method1  14.2 10.0
Method2   2.2 11.2
Method3  11.9 12.0

I want to rank my Train and Test data as follows:

> mydata
        Train Test Train_rank Test_rank
Method1  14.2 10.0          3         1
Method2   2.2 11.2          1         2
Method3  11.9 12.0          2         3

I've tried the following:

library(plyr)
ddply(mydata, .(stat), transform,
      Train_rank = rank(Train),
      Test_rank = rank(Test),
)

but I'm getting the following error:

Error in unique.default(x) : unique() applies only to vectors

Solution

  • Using tidyverse, we can use mutate with across (available from dplyr 1.0.0; earlier versions can use mutate_at/mutate_all instead)

    library(dplyr) # 1.0.0
    mydata %>% 
        mutate(across(everything(), rank, .names = "{col}_rank"))
    #  Train Test Train_rank Test_rank
    #1  14.2 10.0          3         1
    #2   2.2 11.2          1         2
    #3  11.9 12.0          2         3
    
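    For dplyr versions before 1.0.0, which lack across, the mutate_all alternative mentioned above might look like the following sketch. It assumes dplyr 0.8.0 or later, where a named list of functions is accepted and the list name is appended as a suffix to each new column.

    ```r
    library(dplyr)

    # Rebuild the example data from the question (row names omitted here)
    mydata <- data.frame(Train = c(14.2, 2.2, 11.9), Test = c(10, 11.2, 12))

    # A named list of functions keeps the original columns and adds
    # new ones named <column>_<list name>, here Train_rank and Test_rank
    mydata %>%
        mutate_all(list(rank = rank))
    ```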

    If we need to keep the row names (which the tidyverse pipeline above drops), first move them into a column with rownames_to_column, then convert that column back to row names with column_to_rownames at the end

    library(tibble)
    mydata %>%
      rownames_to_column('rn') %>%
      mutate(across(-rn, rank, .names = "{col}_rank")) %>%
      column_to_rownames('rn')
    #         Train Test Train_rank Test_rank
    #Method1  14.2 10.0          3         1
    #Method2   2.2 11.2          1         2
    #Method3  11.9 12.0          2         3
    

    Or with base R, which keeps the row names and adds the rank columns to mydata in place

    mydata[paste0(names(mydata), "_rank")] <- lapply(mydata, rank)
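
    As a self-contained sketch of the base R approach, the snippet below rebuilds mydata from the question and adds the rank columns to a copy, so the original stays untouched. The name ranked is only illustrative. Note that rank() defaults to ties.method = "average", so tied values would receive fractional ranks; the data in the question has no ties.

    ```r
    # Rebuild the example data from the question
    mydata <- data.frame(Train = c(14.2, 2.2, 11.9), Test = c(10, 11.2, 12))
    rownames(mydata) <- c("Method1", "Method2", "Method3")

    # Work on a copy (hypothetical name) so mydata itself is not modified
    ranked <- mydata

    # lapply() ranks each column; the results are assigned under new names
    ranked[paste0(names(mydata), "_rank")] <- lapply(mydata, rank)
    ranked
    ```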