I have the following starting point:
#dataset:
schedule <- tibble(start = as.Date(c("2018-07-11", NA, NA)), duration = c(10,23,9),flag_StartActual = c(TRUE,FALSE,FALSE))
in table format:
> schedule
# A tibble: 3 x 3
start duration flag_StartActual
<date> <dbl> <lgl>
1 2018-07-11 10 TRUE
2 NA 23 FALSE
3 NA 9 FALSE
I would like to compute the end
(= start + duration). Once that is done for the first row, I would like to make the end
of the first row the start
of the second.
I have experimented with various approaches but so far I have not been successful. The things I have considered is:
end
from the previous row with the lag(end) function. This works fine for the second row, but for all the following rows the end
does not exist yet. rowwise()
but in this case I do not get the lag()
function to work. The following code does more or less what I would like it to do but this is not very neat as a mutate needs to be added for every row (and then all the previous rows are recalculated).
> schedule %>%
+ mutate(
+ end = start + ddays(duration),
+ start = as_datetime(ifelse(flag_StartActual==TRUE,start,lag(end)))
+ ) %>%
+ mutate(
+ end = start + ddays(duration),
+ start = as_datetime(ifelse(flag_StartActual==TRUE,start,lag(end)))
+ )
# A tibble: 3 x 4
start duration flag_StartActual end
<dttm> <dbl> <lgl> <dttm>
1 2018-07-11 00:00:00 10 TRUE 2018-07-21 00:00:00
2 2018-07-21 00:00:00 23 FALSE 2018-08-13 00:00:00
3 2018-08-13 00:00:00 9 FALSE NA
Including rowwise()
in the code like below does not work:
schedule %>%
rowwise() %>%
mutate(
end = start + ddays(duration),
start = as_datetime(ifelse(flag_StartActual==TRUE,start,lag(end)))
)
Anyway, I am a bit stuck and hope that somebody has some smart ideas on how to approach this?
library(dplyr)
schedule %>%
mutate(start = start[1] + lag(cumsum(duration), default = 0)
, end = start + duration)
# # A tibble: 3 x 4
# start duration flag_StartActual end
# <date> <dbl> <lgl> <date>
# 1 2018-07-11 10.0 T 2018-07-21
# 2 2018-07-21 23.0 F 2018-08-13
# 3 2018-08-13 9.00 F 2018-08-22