Search code examples
rtimetime-seriessummarymoving-average

Rolling average by time period rather than observation in R


I have a dataset with dates occurring randomly. For example:

10/21/15, 11/21/15, 11/22/15, 11/28/15,11/30/15, 12/12/15...etc

I am looking to create a rolling average by time-period NOT by at the observation level. For instance if I wanted to do a moving average of the last 7 days. I would not want to look up at the last 7 rows, but rather the last 7 days

For a tiny example:

 dates = c('2015-08-07', '2015-08-08','2015-08-09','2015-09-09','2015-10-10')
 value = c(5,10,5,3,2)
 df=data.frame(dates, value)
 df$desired = c(NA,5,7.5, NA,NA)

I am obviously looking to do this for much larger dataset, but I hope you get the idea. If I was to use 7 days for example this is the result I would expect.

Notice that I don't include the current observations value into the rolling average, only the previous. I want rolling average by time period, not observation row number.

I tried looking at rollmean and dplyr but I couldnt figure it out. I don't really care how it happens though.

Thanks!


Solution

  • try this:

    rollavgbyperiod <- function(i,window){
      startdate <- dates[i]-window
      enddate <- dates[i]-1
      interval <- seq(startdate,enddate,1)
    
      tmp <- value[dates %in% interval]
      return(mean(tmp))
    }
    
    dates <- as.Date(dates)
    window <- 7
    res <- sapply(1:length(dates),function(m) rollavgbyperiod(m,window))
    res[is.nan(res)] <- NA
    
    > data.frame(dates,value,res)
           dates value res
    1 2015-08-07     5 NA
    2 2015-08-08    10 5.0
    3 2015-08-09     5 7.5
    4 2015-09-09     3 NA
    5 2015-10-10     2 NA