Could someone kindly advise how best to approach making a new column in a dataframe, where each observation is an average/mean of the previous 12 observations (excluding the current observation). I have failed so far to find a similar answer on here so this would be greatly appreciated!
My data.frame:
LateCounts <-
Date Count
1 Jan-19 7
2 Feb-19 4
3 Mar-19 9
4 Apr-19 8
5 May-19 7
6 Jun-19 4
7 Jul-19 4
8 Aug-19 5
9 Sep-19 2
10 Oct-19 5
11 Nov-19 7
12 Dec-19 4
13 Jan-20 3
14 Feb-20 4
15 Mar-20 5
16 Apr-20 2
17 May-20 3
18 Jun-20 2
19 Jul-20 3
20 Aug-20 4
21 Sep-20 3
22 Oct-20 2
I am currently using the following code:
LateCounts <- LateCounts %>% mutate(RollAvge=rollapplyr(Count, 12, mean, partial = TRUE))
This yields the following but the 12 month rolling average:
Date Count RollAvge
1 Jan-19 7 7
2 Feb-19 4 5.5
3 Mar-19 9 6.666667
4 Apr-19 8 7
5 May-19 7 7
6 Jun-19 4 6.5
7 Jul-19 4 6.142857
8 Aug-19 5 6
9 Sep-19 2 5.555556
10 Oct-19 5 5.5
11 Nov-19 7 5.636364
12 Dec-19 4 5.5
13 Jan-20 3 5.166667
14 Feb-20 4 5.166667
15 Mar-20 5 4.833333
16 Apr-20 2 4.333333
17 May-20 3 4
18 Jun-20 2 3.833333
19 Jul-20 3 3.75
20 Aug-20 4 3.666667
21 Sep-20 3 3.75
22 Oct-20 2 3.5
What i actually need to achieve is the below. This is 12 month trailing or rolling average (where the values in the 'RollAvge' column are averages/means of the previous values in 'Count' column - not including the current month.
Date Count RollAvge
1 Jan-19 7
2 Feb-19 4 7
3 Mar-19 9 5.5
4 Apr-19 8 6.666667
5 May-19 7 7
6 Jun-19 4 7
7 Jul-19 4 6.5
8 Aug-19 5 6.142857
9 Sep-19 2 6
10 Oct-19 5 5.555556
11 Nov-19 7 5.5
12 Dec-19 4 5.636364
13 Jan-20 3 5.5
14 Feb-20 4 5.166667
15 Mar-20 5 5.166667
16 Apr-20 2 4.833333
17 May-20 3 4.333333
18 Jun-20 2 4
19 Jul-20 3 3.833333
20 Aug-20 4 3.75
21 Sep-20 3 3.666667
22 Oct-20 2 3.755556
Thanks,
We need to take the lag
of the output derived from rollapply
.
library(dplyr)
library(zoo)
LateCounts %>%
mutate(RollAvge= lag(rollapplyr(Count, 12, mean, partial = TRUE)))
-output
# Date Count RollAvge
#1 Jan-19 7 NA
#2 Feb-19 4 7.000000
#3 Mar-19 9 5.500000
#4 Apr-19 8 6.666667
#5 May-19 7 7.000000
#6 Jun-19 4 7.000000
#7 Jul-19 4 6.500000
#8 Aug-19 5 6.142857
#9 Sep-19 2 6.000000
#10 Oct-19 5 5.555556
#11 Nov-19 7 5.500000
#12 Dec-19 4 5.636364
#13 Jan-20 3 5.500000
#14 Feb-20 4 5.166667
#15 Mar-20 5 5.166667
#16 Apr-20 2 4.833333
#17 May-20 3 4.333333
#18 Jun-20 2 4.000000
#19 Jul-20 3 3.833333
#20 Aug-20 4 3.750000
#21 Sep-20 3 3.666667
#22 Oct-20 2 3.750000
LateCounts <- structure(list(Date = c("Jan-19", "Feb-19", "Mar-19", "Apr-19",
"May-19", "Jun-19", "Jul-19", "Aug-19", "Sep-19", "Oct-19", "Nov-19",
"Dec-19", "Jan-20", "Feb-20", "Mar-20", "Apr-20", "May-20", "Jun-20",
"Jul-20", "Aug-20", "Sep-20", "Oct-20"), Count = c(7L, 4L, 9L,
8L, 7L, 4L, 4L, 5L, 2L, 5L, 7L, 4L, 3L, 4L, 5L, 2L, 3L, 2L, 3L,
4L, 3L, 2L)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20", "21", "22"))