I need to sum values correspoding to a certain range of dates, and I need to do this for many subjects.
In the next example, df1 contains starting and ending dates for three subjects, and df2 contains dates and their corresponding values:
df1<-data.frame(sub=c("a","b","c"),
start=as.Date(c("2015/10/13","2015/10/13","2015/10/11")),
end=as.Date(c("2015/10/16","2015/10/18","2015/10/15")))
df2<-data.frame(date=seq(as.Date("2015/10/11"), as.Date("2015/10/18"), "days"),
value=c(32,30,28,15,19,23,35,18))
> df1
sub start end
1 a 2015-10-13 2015-10-16
2 b 2015-10-13 2015-10-18
3 c 2015-10-11 2015-10-15
> df2
date value
1 2015-10-11 32
2 2015-10-12 30
3 2015-10-13 28
4 2015-10-14 15
5 2015-10-15 19
6 2015-10-16 23
7 2015-10-17 35
8 2015-10-18 18
I want to sum value
in df2 from start
date until end
date in df1, and I want to do it for each sub
(in the real problem there are many subjects (i.e. pairs of starting and ending dates), and so I thoght maybe I should use for
loop).
I expect something like this:
sub sum
a 85
b 138
c 124
where a = 28+15+19+23, b = 28+15+19+23+35+18 and c = 32+30+28+15+19
Any help is appreciated
An option with mapply
would be to create a sequence of dates between start
and end
dates in df1
and subset those dates from df2
and sum
the value
.
df1$sum <- mapply(function(x, y) sum(df2$value[df2$date %in% x:y]),
df1$start, df1$end)
df1[c(1, 4)]
# sub sum
#1 a 85
#2 b 138
#3 c 124