I have a simple dataset, with date
, sample
, and application
. Every day the value in the sample
should decrease by 1
and increase by whatever is applied.
When we take another sample
we start over from the new value.
The result is given in the variable estimate
. I'm trying to calculate the variable estimate
in R
.
Here's my dataset:
structure(
list(
date = structure(c(17591, 17592, 17593, 17594,
17595, 17596, 17597, 17598, 17599, 17600, 17601, 17602, 17603,
17604, 17605, 17606, 17607, 17608, 17609, 17610, 17611, 17612,
17613, 17614), class = "Date"),
sample = c(30, NA, NA, NA, NA, 20, NA, NA, 40, NA, NA,
NA, NA, NA, 35, NA, NA, 15, NA, NA, NA,
NA, 5, NA),
applied = c(NA, NA, 10, NA, NA, NA, NA, 10, NA, NA,
15, NA, NA, NA, 10, NA, NA, NA, 15, NA, NA, 10, NA, NA),
estimate = c(30, 29, 38, 37, 36, 20, 19, 28, 40, 39, 53, 52,
51, 50, 35, 34, 33, 15, 29, 28, 27, 36, 5, 4)
),
class = "data.frame", row.names = c(NA, -24L)
)
This could probably be cleaned up a little, but it works:
library(dplyr)
df %>%
mutate(group = cumsum(!is.na(sample))) %>%
group_by(group) %>%
mutate(
to_add = ifelse(!is.na(sample), 0, ifelse(is.na(applied), -1, applied - 1)),
result = first(sample) + cumsum(to_add)
)
# # A tibble: 24 x 7
# # Groups: group [6]
# date sample applied estimate group to_add result
# <date> <dbl> <dbl> <dbl> <int> <dbl> <dbl>
# 1 2018-03-01 30 NA 30 1 0 30
# 2 2018-03-02 NA NA 29 1 -1 29
# 3 2018-03-03 NA 10 38 1 9 38
# 4 2018-03-04 NA NA 37 1 -1 37
# 5 2018-03-05 NA NA 36 1 -1 36
# 6 2018-03-06 20 NA 20 2 0 20
# 7 2018-03-07 NA NA 19 2 -1 19
# 8 2018-03-08 NA 10 28 2 9 28
# 9 2018-03-09 40 NA 40 3 0 40
# 10 2018-03-10 NA NA 39 3 -1 39
# # ... with 14 more rows