R: Get ranking of factor levels by group


For now I have only the df with the columns Number and Days.

I want to get a ranking of the factor levels of df$Days in a separate column which is called Ranking.

    df <- data.frame(Number = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
                 Days = c(5,5,10,10,15,3,3,3,5,5,11,11,13,13,13),
                 Ranking = c(1,1,2,2,3,1,1,1,2,2,1,1,2,2,2))

My approach would be to group the data by Days and then mutate the new column, but I'm stuck on how to assign the ranking to it:

    library(dplyr)
    df_new <- df %>%
      dplyr::group_by(Days) %>%
      dplyr::mutate(Ranking = count(unique(levels(Days)))) # This does not work obviously

Can you help me with this? The code should work for any number of factor levels (there could be up to 20 different Days).

Thank you very much in advance!


Solution

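  • The ranking restarts for each Number, so group by Number rather than Days. Then use dplyr::dense_rank, or as.numeric(factor(Days, ordered = TRUE)) in base R:

    library(dplyr)

    df %>%
      group_by(Number) %>%
      # dense_rank() gives tied values the same rank and leaves no gaps
      mutate(Ranking = dense_rank(Days),
             # factor levels are the sorted distinct values, so their codes are the ranks
             Ranking2 = as.numeric(factor(Days, ordered = TRUE)))
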

    output

    # A tibble: 15 × 4
    # Groups:   Number [3]
       Number  Days Ranking Ranking2
        <dbl> <dbl>   <int>    <dbl>
     1      1     5       1        1
     2      1     5       1        1
     3      1    10       2        2
     4      1    10       2        2
     5      1    15       3        3
     6      2     3       1        1
     7      2     3       1        1
     8      2     3       1        1
     9      2     5       2        2
    10      2     5       2        2
    11      3    11       1        1
    12      3    11       1        1
    13      3    13       2        2
    14      3    13       2        2
    15      3    13       2        2
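
If dplyr is not available, the same per-group dense ranking can probably be done in base R alone. The following is only a sketch, not part of the original answer: it assumes the df from the question (without the Ranking column) and uses ave() to apply a function within each Number group, with match() against the sorted distinct Days values to produce the rank.

```r
# Data from the question, without the expected Ranking column
df <- data.frame(
  Number = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  Days   = c(5, 5, 10, 10, 15, 3, 3, 3, 5, 5, 11, 11, 13, 13, 13)
)

# For each Number group, look up every Days value in the group's
# sorted distinct values; the position found is the dense rank.
df$Ranking <- ave(
  df$Days, df$Number,
  FUN = function(x) match(x, sort(unique(x)))
)

print(df)
```

Because the distinct values are computed per group, this should not depend on how many different Days each Number has.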