I have daily data for 7 years. I want to group this into weekly data (based on the actual date) and sum the frequency.
Date Frequency
1 2014-01-01 179
2 2014-01-02 82
3 2014-01-03 89
4 2014-01-04 109
5 2014-01-05 90
6 2014-01-06 66
7 2014-01-07 75
8 2014-01-08 106
9 2014-01-09 89
10 2014-01-10 82
What is the best way to achieve that? Thank you
These solutions all use base R and differ only in the definition and labelling of weeks.
1) cut
the dates into weeks and then aggregate over those. Weeks start on Monday but you can add start.on.monday=FALSE
to cut
to start them on Sunday if you prefer.
Week <- as.Date(cut(DF$Date, "week"))
aggregate(Frequency ~ Week, DF, sum)
## Week Frequency
## 1 2013-12-30 549
## 2 2014-01-06 418
2) If you prefer to define a week as 7 days starting with DF$Date[1] and label them according to the first date in that week then use this. (Add 6 to Week
if you prefer the last date in the week.)
weekno <- as.numeric(DF$Date - DF$Date[1]) %/% 7
Week <- DF$Date[1] + 7 * weekno
aggregate(Frequency ~ Week, DF, sum)
## Week Frequency
## 1 2014-01-01 690
## 2 2014-01-08 277
3) or if you prefer to label it with the first date existing in DF
in that week then use this. This and the last Week definition give the same result if there are no missing dates as is the case here. (If you want the last existing date in the week rather than the first then replace match
with findInterval
.)
weekno <- as.numeric(DF$Date - DF$Date[1]) %/% 7
Week <- DF$Date[match(weekno, weekno)]
aggregate(Frequency ~ Week, DF, sum)
## Week Frequency
## 1 2014-01-01 690
## 2 2014-01-08 277
The input in reproducible form is assumed to be:
Lines <- "Date Frequency
1 2014-01-01 179
2 2014-01-02 82
3 2014-01-03 89
4 2014-01-04 109
5 2014-01-05 90
6 2014-01-06 66
7 2014-01-07 75
8 2014-01-08 106
9 2014-01-09 89
10 2014-01-10 82"
DF <- read.table(text = Lines)
DF$Date <- as.Date(DF$Date)