I have a data frame, df, that looks like the table below, and I need code that produces a variable called change. change is defined as the first time to a permanent positive outcome (outcome = 1).

The logic is as follows:

Each id has 5 visits, with the value of outcome recorded at each visit.
change can only be 1 if outcome is 1 at visit x and at every visit thereafter.
id 2 cannot have change = 1 at time 2 because outcome reverts to a negative result at time 3.
id 3 at visit 2 has a missing outcome (NA), so it could be 1 or 0. Since the value at this visit could be 1, change should be 1.

My data with the desired output variable is:
id visit outcome change
1 1 0 0
1 2 0 0
1 3 0 0
1 4 1 1
1 5 1 0
2 1 0 0
2 2 1 0
2 3 0 0
2 4 1 1
2 5 1 0
3 1 0 0
3 2 NA 1
3 3 1 1
3 4 1 0
3 5 1 0
You can do this easily with dplyr:
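library(dplyr)

# Example data from the question
df <- data.frame(id = rep(c(1, 2, 3), each = 5),
                 visit = rep(1:5, 3),
                 outcome = c(0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, NA, 1, 1, 1))

df %>%
  group_by(id) %>%
  mutate(
    # 1 when both this visit and the next visit have outcome == 1
    change = as.numeric(lead(outcome) == 1 & outcome == 1),
    # the last visit has no successor, so set it to 0
    change = ifelse(visit == 5, 0, change),
    # where change is still NA (a missing outcome), borrow the next visit's value
    change = ifelse(is.na(change), lead(change), change)
  )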
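
The lead()-based code above returns 1 at every visit that is followed by another 1, so for id 3 it gives 0, 1, 1, 1, 0 rather than a single first time. If you want the rule exactly as worded in the question ("outcome is 1 at visit x and thereafter"), here is a minimal base R sketch. It rests on assumptions that are mine, not the question's: flag_change is a hypothetical helper name, rows are sorted by visit within each id, and an NA outcome counts as possibly positive. It flags exactly one visit per id, so for id 3 it flags visit 2 only, whereas the desired table also shows 1 at visit 3.

```r
# Example data from the question
df <- data.frame(id = rep(c(1, 2, 3), each = 5),
                 visit = rep(1:5, 3),
                 outcome = c(0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, NA, 1, 1, 1))

# Hypothetical helper: flag the first visit from which the outcome stays
# positive through the last visit. `outcome` must be ordered by visit.
flag_change <- function(outcome) {
  # Assumption: a missing outcome counts as possibly positive
  positive <- is.na(outcome) | outcome == 1
  # TRUE where this visit and every later visit are positive
  permanent <- rev(cumprod(rev(positive)) == 1)
  change <- integer(length(outcome))
  if (any(permanent)) change[which(permanent)[1]] <- 1L
  change
}

# Apply the helper within each id; ave() returns values in the original row order
df$change <- ave(df$outcome, df$id, FUN = flag_change)
df
```

Reversing the vector before cumprod() is what turns "positive so far" into "positive from here on": the running product stays 1 only while every later visit is positive, and the first such visit is the one that gets flagged.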