I'm using this as an example (Calculating moving average), which I have successfully incorporated into my code. I need to calculate a rolling mean and a rolling median (which I have done), but my data set is enormous and I need to add a secondary variable to filter by. In the example below, they calculate the rolling mean for a data set of 10 days. What if there are 10 days for each of several locations, and we need to calculate the rolling means over those 10 days separately for each location?
library(tidyverse)
library(zoo)
some_data = tibble(day = 1:10)
# roll_mean: centered window (rollmean's default, align = "center")
# roll_median: trailing window (align = "right")
some_data = some_data %>%
  mutate(roll_mean = rollmean(day, k = 3, fill = NA)) %>%
  mutate(roll_median = rollmedian(day, k = 3, fill = NA, align = "right"))
some_data
You can group by location:
library(tidyverse)
library(zoo)
some_data <- bind_rows(tibble(day = 1:5, location = "A"),
                       tibble(day = 1:5, location = "B"))
some_data <- some_data %>%
  group_by(location) %>%
  mutate(roll_mean_left = rollmean(day, k = 3, fill = NA, align = "left"),
         roll_mean_center = rollmean(day, k = 3, fill = NA, align = "center"),
         roll_median_right = rollmedian(day, k = 3, fill = NA, align = "right"))
some_data
The rolling functions restart for each location, so no window ever spans two locations.
Note how the position of the window relative to the current row depends on the align
parameter:
     day location roll_mean_left roll_mean_center roll_median_right
   <int> <chr>             <dbl>            <dbl>             <dbl>
 1     1 A                     2               NA                NA
 2     2 A                     3                2                NA
 3     3 A                     4                3                 2
 4     4 A                    NA                4                 3
 5     5 A                    NA               NA                 4
 6     1 B                     2               NA                NA
 7     2 B                     3                2                NA
 8     3 B                     4                3                 2
 9     4 B                    NA                4                 3
10     5 B                    NA               NA                 4
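One thing to watch on real data: the rolling functions work in row order, so the rows should be sorted by day within each location before rolling. Here is a minimal sketch of that, assuming a hypothetical `value` column and a made-up, deliberately shuffled data frame called `readings` (neither is from the question):

```r
library(dplyr)
library(zoo)

# Hypothetical data: two locations, five days each, rows shuffled on purpose
readings <- data.frame(
  location = rep(c("A", "B"), each = 5),
  day      = rep(1:5, times = 2),
  value    = c(10, 20, 30, 40, 50, 1, 2, 3, 4, 5)
) %>%
  slice(c(10, 3, 7, 1, 5, 8, 2, 9, 4, 6))

rolled <- readings %>%
  arrange(location, day) %>%   # put each location's days in order first
  group_by(location) %>%       # windows restart at every location
  mutate(
    roll_mean   = rollmean(value, k = 3, fill = NA, align = "right"),
    roll_median = rollmedian(value, k = 3, fill = NA, align = "right")
  ) %>%
  ungroup()                    # drop the grouping once the columns exist

rolled
```

With trailing windows of width 3, the first two rows of each location are NA, so the first B rows are not contaminated by the last A values. The `ungroup()` at the end is a design choice so that later steps do not silently keep operating per location.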