r, dplyr, mean, zoo

Rolling mean across day of year


In my example below, I can calculate a centered 7-day rolling mean; however, the first three days and the last three days are NA values. The rolling mean should take into account that day 365 is followed by day 1, so the windows at the start and end of the year should wrap around. How can I calculate a rolling 7-day mean so that there are no NA values?

library(tidyverse)
library(zoo)

set.seed(321)

aa <- data.frame(
  doy = seq(1,365,1),
  value = round(rnorm(365,30,5))
)

bb <- aa %>%
  mutate(movingAVG = round(rollmean(value, k = 7, align = 'center', fill = NA)))

head(bb)
#>   doy value movingAVG
#> 1   1    39        NA
#> 2   2    26        NA
#> 3   3    29        NA
#> 4   4    29        31
#> 5   5    29        30
#> 6   6    31        31

tail(bb)
#>     doy value movingAVG
#> 360 360    24        30
#> 361 361    38        29
#> 362 362    30        29
#> 363 363    20        NA
#> 364 364    26        NA
#> 365 365    29        NA

Created on 2023-11-29 with reprex v2.0.2


Solution

  • One potential option is to stack three copies of your "aa" dataframe (days 1-365, then 1-365, then 1-365), calculate the rolling mean over all 1095 rows, then keep only the middle copy, where days 1-3 and 363-365 now have neighbours on both sides, e.g.

    library(tidyverse)
    library(zoo)
    
    set.seed(321)
    
    aa <- data.frame(
      doy = seq(1,365,1),
      value = round(rnorm(365,30,5))
    )
    
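    # Stack three copies of aa; .id = "index" labels them "1", "2" and "3"
    bb <- bind_rows(aa, aa, aa, .id = "index") %>%
      mutate(movingAVG = round(rollmean(value, k = 7, align = 'center', fill = NA))) %>%
      # Keep only the middle copy, which has neighbours on both sides
      filter(index == "2") %>%
      select(-index)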
    
    head(bb)
    #>   doy value movingAVG
    #> 1   1    39        28
    #> 2   2    26        30
    #> 3   3    29        30
    #> 4   4    29        31
    #> 5   5    29        30
    #> 6   6    31        31
    tail(bb)
    #>     doy value movingAVG
    #> 360 360    24        30
    #> 361 361    38        29
    #> 362 362    30        29
    #> 363 363    20        29
    #> 364 364    26        30
    #> 365 365    29        28
    

    Created on 2023-11-30 with reprex v2.0.2

    Does that make sense?
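
If tripling the data feels heavy, here is a rough sketch of two lighter variants of the same wrap-around idea. Neither comes from the answer above. The first pads only the three values needed on each side before calling `rollmean()`. The second uses base R's `stats::filter()` with `circular = TRUE`. The helper names `k` and `h` and the columns `movingAVG_pad` and `movingAVG_circ` are made up for illustration, and the sketch assumes an odd window width and one row per day with no gaps.

```r
library(zoo)

set.seed(321)

aa <- data.frame(
  doy = seq(1, 365, 1),
  value = round(rnorm(365, 30, 5))
)

k <- 7        # window width (assumed odd, so the window is symmetric)
h <- k %/% 2  # number of days on each side of the centre day

# Variant 1: wrap-around padding.
# Put the last h values in front and the first h values at the end,
# so every original day has a full window; no fill is needed.
padded <- c(tail(aa$value, h), aa$value, head(aa$value, h))
aa$movingAVG_pad <- round(rollmean(padded, k = k, align = "center"))

# Variant 2: circular moving-average filter from base R.
# The stats:: prefix avoids a clash with dplyr::filter().
aa$movingAVG_circ <- round(as.numeric(
  stats::filter(aa$value, rep(1 / k, k), sides = 2, circular = TRUE)
))

head(aa)
tail(aa)
```

Both columns should agree with each other and with the three-copy approach, since every window covers the same seven days. The padding variant keeps `rollmean()` in the pipeline, while the circular filter needs no extra package.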