R: How to extract factor levels as numeric values from a column and assign them to a new column using tidyverse?


Suppose I have a data frame, df:

    df <- data.frame(name = rep(c("A", "B", "C"), each = 4))

I want to get a new data frame with one additional column named Group, where each Group value is the numeric code of the corresponding level of name, as shown in df2 below.

I know case_when could do it, but my real data frame is more complicated: the name column has many levels, and I would rather not list them case by case.

Is there an easier and smarter way to do it?

Thanks.

df2
   name Group
1     A     1
2     A     1
3     A     1
4     A     1
5     B     2
6     B     2
7     B     2
8     B     2
9     C     3
10    C     3
11    C     3
12    C     3

Solution

  • There are a few ways to do this in the tidyverse:

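    library(tidyverse)
    
    # Number each group of name, then drop the grouping so later steps are not grouped
    df %>% group_by(name) %>% mutate(Group = cur_group_id()) %>% ungroup()
    

    or

    # Convert name to a factor and take its numeric level codes
    df %>% mutate(Group = as.numeric(as.factor(name)))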
    

    Output

       name Group
    1     A     1
    2     A     1
    3     A     1
    4     A     1
    5     B     2
    6     B     2
    7     B     2
    8     B     2
    9     C     3
    10    C     3
    11    C     3
    12    C     3
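
One caveat worth keeping in mind: the as.factor() approach numbers the values by their sorted level order, not by the order in which they appear in the data (group_by() likewise sorts its group keys). A minimal base R sketch of the difference follows. The data frame df_demo and its column names are made up for illustration, with the names deliberately out of alphabetical order, and match() with unique() is shown as one possible way to number values by first appearance instead:

```r
# Hypothetical toy data: names are NOT in alphabetical order
df_demo <- data.frame(name = rep(c("B", "A", "C"), each = 2))

# factor() sorts the levels, so the codes follow sorted order: A = 1, B = 2, C = 3
df_demo$Group_sorted <- as.integer(factor(df_demo$name))

# match() against unique() numbers values by first appearance: B = 1, A = 2, C = 3
df_demo$Group_seen <- match(df_demo$name, unique(df_demo$name))

print(df_demo)
```

Either expression can also be used inside mutate(), for example mutate(Group = match(name, unique(name))), if the order of appearance matters more than the sorted order.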