I am using R to analyze the 'momentum' of protest movements in Africa. To do so, I am analyzing individual protest events. I want to create a rolling measure of the rolling number (sum) of protests within a time period.
Most of the answers here on Stack Overflow deal with datasets where observations are at fixed intervals (one obs. per day or per month, etc.). But my data are 'ragged' in the sense that they occur in different intervals. Sometimes there is one day between observations. Other times there are two weeks.
A rolling sum of the number of protest events that have occurred in a given country over the past 10 days. This would be in the form of a variable that simply sums the number of events within the past ten days, inclusive of the current event.
Here is a reproducible set of data:
df1 <- data.frame(date = c("8/1/2019", "8/2/2019", "8/3/2019", "8/6/2019", "8/15/2019", "8/16/2019", "8/30/2019", "9/1/2019", "9/2/2019", "9/3/2019", "9/4/2019", "6/1/2019", "6/26/2019", "7/1/2019", "7/2/2019", "7/9/2019", "7/10/2019", "8/1/2019", "8/2/2019", "8/15/2019", "8/28/2019", "9/1/2019"),
country = c(rep("Algeria", 11), rep("Benin", 11)),
event = rep("Protest", 22))
date country event roll_sum
-------- ------- ------- --------
8/1/2019 Algeria Protest 1
8/2/2019 Algeria Protest 2
8/3/2019 Algeria Protest 3
8/6/2019 Algeria Protest 4
8/15/2019 Algeria Protest 2
8/16/2019 Algeria Protest 3
8/30/2019 Algeria Protest 1
9/1/2019 Algeria Protest 2
9/2/2019 Algeria Protest 3
9/3/2019 Algeria Protest 4
9/4/2019 Algeria Protest 5
6/1/2019 Benin Protest 1
6/26/2019 Benin Protest 1
7/1/2019 Benin Protest 2
7/2/2019 Benin Protest 3
7/9/2019 Benin Protest 3
7/10/2019 Benin Protest 4
8/1/2019 Benin Protest 1
8/2/2019 Benin Protest 2
8/15/2019 Benin Protest 1
8/28/2019 Benin Protest 1
9/1/2019 Benin Protest 2
This is all probably very simple, but I can't figure out how to do it. Thank you in advance!
A base R
approach,
df1$date <- as.Date(df1$date,"%m/%d/%Y")
vector <- vector()
for( j in unique(df1$country)) {
df2 <- df1[df1$country==j,]
for(i in 1:nrow(df2)) {
k <- nrow(df2[df2$date<= df2$date[i] & df2$date>=df2$date[i]-10 ,])
vector <- c(vector, k)
}
}
df1$roll_sum <- vector
gives,
date country event roll_sum
1 2019-08-01 Algeria Protest 1
2 2019-08-02 Algeria Protest 2
3 2019-08-03 Algeria Protest 3
4 2019-08-06 Algeria Protest 4
5 2019-08-15 Algeria Protest 2
6 2019-08-16 Algeria Protest 3
7 2019-08-30 Algeria Protest 1
8 2019-09-01 Algeria Protest 2
9 2019-09-02 Algeria Protest 3
10 2019-09-03 Algeria Protest 4
11 2019-09-04 Algeria Protest 5
12 2019-06-01 Benin Protest 1
13 2019-06-26 Benin Protest 1
14 2019-07-01 Benin Protest 2
15 2019-07-02 Benin Protest 3
16 2019-07-09 Benin Protest 3
17 2019-07-10 Benin Protest 4
18 2019-08-01 Benin Protest 1
19 2019-08-02 Benin Protest 2
20 2019-08-15 Benin Protest 1
21 2019-08-28 Benin Protest 1
22 2019-09-01 Benin Protest 2