I have data that looks like this:
intial<-tibble(start_date=rep(seq.Date(as.Date("2021-06-01"),as.Date("2021-10-01"),by="months"),4),end_date=rep(seq.Date(as.Date("2021-07-01"),as.Date("2021-11-01"),by="months"),4),id=rep(c(rep(1,5),rep(2,5)),2),group=c(rep("a",10),rep("b",10)),increase=c(c(4:8),c(5:9),c(6:10),c(7:11)),decrease=c(c(1:5),c(2:6),c(3:7),c(4:8)),start_count=c(c(10,13,16,19,NA),c(15,18,21,24,NA),c(20,23,26,29,NA),c(20,23,26,29,NA)),end_count=c(c(13,16,19,NA,NA),c(18,21,24,NA,NA),c(23,26,29,NA,NA),c(23,26,29,NA,NA)))
print(initial)
start_date end_date id group increase decrease start_count end_count
<date> <date> <dbl> <chr> <int> <int> <dbl> <dbl>
1 2021-06-01 2021-07-01 1 a 4 1 10 13
2 2021-07-01 2021-08-01 1 a 5 2 13 16
3 2021-08-01 2021-09-01 1 a 6 3 16 19
4 2021-09-01 2021-10-01 1 a 7 4 19 NA
5 2021-10-01 2021-11-01 1 a 8 5 NA NA
6 2021-06-01 2021-07-01 2 a 5 2 15 18
7 2021-07-01 2021-08-01 2 a 6 3 18 21
8 2021-08-01 2021-09-01 2 a 7 4 21 24
9 2021-09-01 2021-10-01 2 a 8 5 24 NA
10 2021-10-01 2021-11-01 2 a 9 6 NA NA
11 2021-06-01 2021-07-01 1 b 6 3 20 23
12 2021-07-01 2021-08-01 1 b 7 4 23 26
13 2021-08-01 2021-09-01 1 b 8 5 26 29
14 2021-09-01 2021-10-01 1 b 9 6 29 NA
15 2021-10-01 2021-11-01 1 b 10 7 NA NA
16 2021-06-01 2021-07-01 2 b 7 4 20 23
17 2021-07-01 2021-08-01 2 b 8 5 23 26
18 2021-08-01 2021-09-01 2 b 9 6 26 29
19 2021-09-01 2021-10-01 2 b 10 7 29 NA
20 2021-10-01 2021-11-01 2 b 11 8 NA NA
Where each unique combination of id
and group
contains an index of dates. I need to use the values in the increase
and decrease
fields to calculate and sequentially populate start_count
and end_count
over each date within each id
and group
combination.
end_count
is calculated by: start_count
+ increase
- decrease
, and start_count
is the previous date's end_count
.
Basically I want to end up with something that would look like the following for this example: final<-tibble(start_date=rep(seq.Date(as.Date("2021-06-01"),as.Date("2021-10-01"),by="months"),4),end_date=rep(seq.Date(as.Date("2021-07-01"),as.Date("2021-11-01"),by="months"),4),id=rep(c(rep(1,5),rep(2,5)),2),group=c(rep("a",10),rep("b",10)),increase=c(c(4:8),c(5:9),c(6:10),c(7:11)),decrease=c(c(1:5),c(2:6),c(3:7),c(4:8)),start_count=c(c(10,13,16,19,21),c(15,18,21,24,27),c(20,23,26,29,32),c(20,23,26,29,31)),end_count=c(c(13,16,19,21,24),c(18,21,24,27,30),c(23,26,29,32,35),c(23,26,29,31,34)))
print(final)
start_date end_date id group increase decrease start_count end_count
<date> <date> <dbl> <chr> <int> <int> <dbl> <dbl>
1 2021-06-01 2021-07-01 1 a 4 1 10 13
2 2021-07-01 2021-08-01 1 a 5 2 13 16
3 2021-08-01 2021-09-01 1 a 6 3 16 19
4 2021-09-01 2021-10-01 1 a 7 4 19 21
5 2021-10-01 2021-11-01 1 a 8 5 21 24
6 2021-06-01 2021-07-01 2 a 5 2 15 18
7 2021-07-01 2021-08-01 2 a 6 3 18 21
8 2021-08-01 2021-09-01 2 a 7 4 21 24
9 2021-09-01 2021-10-01 2 a 8 5 24 27
10 2021-10-01 2021-11-01 2 a 9 6 27 30
11 2021-06-01 2021-07-01 1 b 6 3 20 23
12 2021-07-01 2021-08-01 1 b 7 4 23 26
13 2021-08-01 2021-09-01 1 b 8 5 26 29
14 2021-09-01 2021-10-01 1 b 9 6 29 32
15 2021-10-01 2021-11-01 1 b 10 7 32 35
16 2021-06-01 2021-07-01 2 b 7 4 20 23
17 2021-07-01 2021-08-01 2 b 8 5 23 26
18 2021-08-01 2021-09-01 2 b 9 6 26 29
19 2021-09-01 2021-10-01 2 b 10 7 29 31
20 2021-10-01 2021-11-01 2 b 11 8 31 34
We could use
library(dplyr)
library(zoo)
intial %>%
group_by(id, group) %>%
mutate(start_count_new = coalesce(start_count,
zoo::na.locf0(start_count) + increase - decrease),
end_count_new = coalesce(lead(start_count_new),
lead(start_count_new, default = last(start_count_new)) + increase - decrease))