Group persons by year of birth in R


I have the following dataset:

df <- data.frame(x1 = c(1, 5, 7, 8, 2, 2, 3, 4, 5, 10),
                 birthyear = c(1992, 1994, 1993, 1992, 1995, 1999, 2000, 2001, 2000, 1994))

I want to group persons into 3-year intervals, so that persons born in 1992-1994 are in group 1, those born in 1995-1997 are in group 2, and so on. My actual dataset is far larger, with over 10,000 entries. What is the most efficient way to do this?


Solution

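  • I would simply use cut() with breaks defined by seq(). With right = FALSE the intervals are closed on the left, so 1992-1994 falls into the first group, and labels = FALSE returns integer group codes instead of a factor:

    df$group <- cut(df$birthyear,
                    breaks = seq(1992, 2022, by = 3),  # 1992, 1995, 1998, ...
                    labels = FALSE,                    # integer codes, not factor labels
                    right = FALSE)                     # intervals are [1992, 1995), [1995, 1998), ...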
    df
    

    Output:

    #>    x1 birthyear group
    #> 1   1      1992     1
    #> 2   5      1994     1
    #> 3   7      1993     1
    #> 4   8      1992     1
    #> 5   2      1995     2
    #> 6   2      1999     3
    #> 7   3      2000     3
    #> 8   4      2001     4
    #> 9   5      2000     3
    #> 10 10      1994     1
    

    Created on 2022-05-03 by the reprex package (v2.0.1)
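
As a possible alternative, the same grouping can be sketched with integer division, which avoids hard-coding the break points. This is only an illustration, not part of the original answer: it assumes the first group should start at the earliest birth year in the data, and the names `width`, `start`, `group_arith` and `group_cut` are made up for the example. The second half derives the cut() breaks from the data in the same spirit, so the two results can be compared.

```r
# Same example data as in the question.
df <- data.frame(x1 = c(1, 5, 7, 8, 2, 2, 3, 4, 5, 10),
                 birthyear = c(1992, 1994, 1993, 1992, 1995,
                               1999, 2000, 2001, 2000, 1994))

width <- 3                   # years per group (taken from the question)
start <- min(df$birthyear)   # assumption: first group starts at the earliest year

# Integer division maps 1992-1994 to 1, 1995-1997 to 2, and so on.
df$group_arith <- (df$birthyear - start) %/% width + 1

# Same grouping via cut(), with breaks derived from the data range.
# Adding `width` to the maximum ensures the last break lies above every year.
breaks <- seq(start, max(df$birthyear) + width, by = width)
df$group_cut <- cut(df$birthyear, breaks, labels = FALSE, right = FALSE)

df
```

Both columns should agree with the `group` column shown above. The arithmetic version is fully vectorised and does not need to search a break vector, though for around 10,000 rows either approach should be fast enough that the difference is unlikely to matter.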