Search code examples
rrollapply

Using Rollapply in a data table with a variable width in R


I want to use rollapply to a calculate a moving average in a data table in R. Each row in the data table is one month and I would like to average the last 36 months but will go as low as 24 if that is all that is available.

The code below calculates the 36 month moving average of "Points" ignoring NAs for each "username". Then it counts how many non-NAs there are and if it is less than 24 it sets the moving average to NA.

dt[,MovAvg:=rollapply(Points,width=36,FUN=mean,fill=NA,alight="right",na.rm=TRUE),by="username"]
dt[,NACount:=rollapply(Points,width=36,FUN=function(x) sum(!is.na(x)),fill=NA,alight="right"),by="username"]
dt[,MovAvg:=ifelse(NACount>=24,MovAvg,NA)]

This code works fine except for in the beginning for rows 1-35. I would like it to calculate the moving average with the data available as long as there are at least 24 data points. Rollapply starts on row 36 because that is how I set the width.

Is there a way to do this in a data table? Thank you


Solution

  • For rolling mean and sum there are dedicated functions in zoo (rollmean and rollsum). If you want to apply the function for atleast 24 datapoints you can use the partial argument which is available only in rollapply. Also rollapply(...., align = 'right') is same as rollapplyr. You can try :

    library(data.table)
    library(zoo)
    
    dt[, MovAvg := rollapplyr(Points,36,mean,na.rm = TRUE, partial = 24,fill = NA),username]