I'm trying to add two new variables to an existing data.frame. Each should be a running value that starts at 1 (or 0) once a condition is met, computed separately for each ID in the data.frame. The data.frame has a structure similar to this:
ID Var1
1 0
1 2
1 5
1 12
2 0
2 2
2 NA
2 11
and I want to get to:
ID Var1 start stop
1 0 0 0
1 2 0 1
1 5 1 2
1 12 2 3
2 0 0 0
2 2 0 1
2 NA 1 2
2 11 2 3
start should be a running value that begins once Var1 > 0 for the first time, and stop should operate the same way. start's starting value should be 0 and stop's starting value should be 1. Both should keep running if Var1 takes on NA or 0 again later in the data.frame. I have tried the following:
df %>%
  group_by(ID) %>%
  mutate(stop = ifelse(Var1 > 0, 0:nrow(df), 0))
But the variable it returns doesn't start with 0; it starts with the number of the row in which the condition is first met.
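A minimal sketch of why that happens, using a hypothetical vector `x` that stands in for Var1 within a single ID (it is not taken from the data above): ifelse() picks values from the 0:n sequence by position, so each matching row receives its own row offset rather than a count that restarts at the first match. Chaining two cumsum() calls gives a count that begins at the first match instead.

```r
# Hypothetical values for one ID; the first value > 0 sits in row 3
x <- c(0, 0, 3, NA, 7)

# The attempt: ifelse() takes the element of 0:n at the same position as
# each TRUE, so row 3 gets 2 rather than 1, and the NA row stays NA
attempt <- ifelse(x > 0, 0:length(x), 0)

# Treat NA as 0, flag every row from the first positive value onward,
# then count the flagged rows
positive <- replace(x, is.na(x), 0) > 0
running  <- cumsum(cumsum(positive) > 0)

print(attempt)
print(running)
```

Here `attempt` jumps straight to the row offset, while `running` counts 1, 2, 3 from the first positive value and keeps counting through the NA.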
Here is a base R option using ave + replace:
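transform(df,
  # Start: running count shifted down one row, so it stays 0 on the first row with Var1 > 0
  Start = ave(ave(replace(Var1, is.na(Var1), 0) > 0, ID, FUN = cumsum) > 0, ID,
              FUN = function(x) cumsum(c(0, x))[-(length(x) + 1)]),
  # Stop: running count that begins at the first row with Var1 > 0 (NA treated as 0)
  Stop = ave(ave(replace(Var1, is.na(Var1), 0) > 0, ID, FUN = cumsum) > 0, ID, FUN = cumsum)
)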
or
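transform(df,
  # Start: the doubled cumsum exceeds 1 only from the row after Var1 first exceeds 0
  Start = ave(ave(ave(replace(Var1, is.na(Var1), 0) > 0, ID, FUN = cumsum), ID, FUN = cumsum) > 1,
              ID, FUN = cumsum),
  # Stop: running count that begins at the first row with Var1 > 0 (NA treated as 0)
  Stop = ave(ave(replace(Var1, is.na(Var1), 0) > 0, ID, FUN = cumsum) > 0, ID, FUN = cumsum)
)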
which gives
ID Var1 Start Stop
1 1 0 0 0
2 1 2 0 1
3 1 5 1 2
4 1 12 2 3
5 2 0 0 0
6 2 2 0 1
7 2 NA 1 2
8 2 11 2 3
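The nested ave() calls can be hard to read, so here is a rough sketch that unpacks the same idea into named steps. The helper name `run_from_first_positive` is made up for illustration, and the data.frame is rebuilt from the sample in the question; treat it as one possible way to structure the computation rather than a definitive version.

```r
# Sample data from the question
df <- data.frame(
  ID   = c(1, 1, 1, 1, 2, 2, 2, 2),
  Var1 = c(0, 2, 5, 12, 0, 2, NA, 11)
)

# Hypothetical helper: running count within one ID, starting at the
# first row where Var1 > 0
run_from_first_positive <- function(v) {
  positive <- replace(v, is.na(v), 0) > 0  # NA counts as "not positive"
  started  <- cumsum(positive) > 0         # TRUE from the first positive row onward
  cumsum(started)                          # 0 before that row, then 1, 2, 3, ...
}

# Apply the helper separately to each ID
df$Stop <- ave(df$Var1, df$ID, FUN = run_from_first_positive)

# Start is Stop shifted down one row within each ID, padded with 0
df$Start <- ave(df$Stop, df$ID, FUN = function(s) c(0, head(s, -1)))

# Reorder the columns to match the output shown above
df <- df[c("ID", "Var1", "Start", "Stop")]
print(df)
```

Computing the "has started" flag once and deriving Start from Stop avoids repeating the replace()/cumsum() expression for both columns.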