I have the following dataset:
df <- data.frame(x1 = c(1, 5, 7, 8, 2, 2, 3, 4, 5, 10),
                 birthyear = c(1992, 1994, 1993, 1992, 1995, 1999, 2000, 2001, 2000, 1994))
I want to group persons into 3-year intervals, so that persons born in 1992-1994 are in group 1, those born in 1995-1997 are in group 2, and so on. My real dataset is far larger, with over 10,000 entries. What is the most efficient way to do this?
I would simply use cut with breaks defined with seq:
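df$group <- cut(df$birthyear,
                breaks = seq(1992, 2022, 3),  # break points: 1992, 1995, 1998, ...
                labels = FALSE,               # return integer group codes instead of factor labels
                right = FALSE)                # intervals are [a, b), so 1995 starts group 2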
df
Output:
#>    x1 birthyear group
#> 1   1      1992     1
#> 2   5      1994     1
#> 3   7      1993     1
#> 4   8      1992     1
#> 5   2      1995     2
#> 6   2      1999     3
#> 7   3      2000     3
#> 8   4      2001     4
#> 9   5      2000     3
#> 10 10      1994     1
Created on 2022-05-03 by the reprex package (v2.0.1)
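If you would rather not hard-code the start and end years, here is a minimal sketch of two variants: plain integer division, and cut with breaks derived from the data. It assumes the first interval should start at the earliest birth year in the data. The names width, start, breaks, group_arith and group_cut are made up for illustration.

```r
df <- data.frame(x1 = c(1, 5, 7, 8, 2, 2, 3, 4, 5, 10),
                 birthyear = c(1992, 1994, 1993, 1992, 1995,
                               1999, 2000, 2001, 2000, 1994))

width <- 3                  # interval width in years
start <- min(df$birthyear)  # assumption: intervals start at the earliest birth year

# Integer division gives a zero-based interval index; add 1 for 1-based group numbers
df$group_arith <- (df$birthyear - start) %/% width + 1

# Same grouping with cut(), but with breaks computed from the data.
# Going up to max + width guarantees the last break lies above the largest year.
breaks <- seq(start, max(df$birthyear) + width, by = width)
df$group_cut <- cut(df$birthyear, breaks, labels = FALSE, right = FALSE)

df
```

Both columns should agree on this sample. The arithmetic version is a single vectorised expression, so it should scale comfortably to a dataset of 10,000+ rows, though cut is also vectorised and either should be fast at that size.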