I have a dataset as follows:
df <- structure(list(id = c(1, 2, 2, 2), enrollment = c(2014, 2011,
2012, 2013), deregister = c(2016, 9999, 9999, 9999)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L))
I need to convert that dataset to the following one:
structure(list(id = c(1, 1, 1, 2, 2, 2), enrollment = c(2014,
2015, 2016, 2011, 2012, 2013), deregister = c(9999, 9999, 2016,
9999, 9999, 9999)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-6L))
The idea is: if deregister is not 9999, add new rows to the dataset by incrementing enrollment by 1 until enrollment equals deregister. Within that sequence, set deregister to 9999 on every row except the last one, where enrollment equals deregister.
Since I have a lot of observations, I want to create the dataset without loops.
Thanks.
You can use mapply with `:` to create the sequences:
library(dplyr)
library(tidyr)
df %>%
mutate(enrollment = ifelse(deregister != 9999, mapply(`:`, enrollment, deregister), enrollment)) %>%
unnest_longer(enrollment) %>%
mutate(deregister = replace(deregister, enrollment != deregister, 9999))
#   id enrollment deregister
# 1  1       2014       9999
# 2  1       2015       9999
# 3  1       2016       2016
# 4  2       2011       9999
# 5  2       2012       9999
# 6  2       2013       9999
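
One thing to keep in mind with many observations: the mapply call is evaluated for every row, so rows with deregister = 9999 also get a long sequence built (e.g. 2011:9999) before ifelse discards it. If that becomes a bottleneck, a possible alternative is to compute how many rows each input row needs and repeat the rows with rep and sequence. This is only a minimal base R sketch, not a tested replacement. It assumes that 9999 is the only sentinel value and that deregister is never smaller than enrollment; the names n and out are just for illustration.

```r
# Example data from the question
df <- data.frame(
  id = c(1, 2, 2, 2),
  enrollment = c(2014, 2011, 2012, 2013),
  deregister = c(2016, 9999, 9999, 9999)
)

# Rows needed per input row: 1 when deregister is the 9999 sentinel,
# otherwise one row per year from enrollment through deregister.
# Assumption: deregister >= enrollment whenever it is not 9999.
n <- ifelse(df$deregister == 9999, 1, df$deregister - df$enrollment + 1)

# Repeat each row n times, then add 0, 1, 2, ... to enrollment within each repeated block
out <- df[rep(seq_len(nrow(df)), n), ]
out$enrollment <- out$enrollment + sequence(n) - 1

# Keep the real deregister year only on the row where it equals enrollment
out$deregister <- ifelse(out$enrollment == out$deregister, out$deregister, 9999)
rownames(out) <- NULL
out
```

On the example data this should give the same six rows as the dplyr version above, without building a sequence for the rows that do not need expanding.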