r, rollapply

Rolling sum in R


df <- data.frame(x = 1:10)

I want this:

df$y <- c(1, 2, 3, 4, 5, 15, 20 , 25, 30, 35)

i.e. each y is the sum of the previous five x values, not counting the current one. For the first five rows, where fewer than five previous values exist, y should simply equal x.

What I get is this:

df$y1 <- c(df$x[1:4], RcppRoll::roll_sum(df$x, 5)) 

   x  y y1
   1  1  1
   2  2  2
   3  3  3
   4  4  4
   5  5 15
   6 15 20
   7 20 25
   8 25 30
   9 30 35
  10 35 40

In summary, I need y, but I am only able to get y1.


Solution

    1) enhanced sum function. Define a function Sum that returns the sum of its first 5 values when it receives 6 values, and returns the last value otherwise. Then use it with partial = TRUE in rollapplyr:

    library(zoo)
    x <- df$x
    Sum <- function(x) if (length(x) < 6) tail(x, 1) else sum(head(x, -1))
    rollapplyr(x, 6, Sum, partial = TRUE)
    ##  [1]  1  2  3  4  5 15 20 25 30 35
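To see why approach 1 works, it can help to look at how many values the function receives at each position. The following is a small sketch, assuming the zoo package is installed and x is the 1:10 vector from the question; window_sizes is a name introduced here for illustration.

```r
library(zoo)

x <- 1:10

# With partial = TRUE, a right-aligned window of width 6 is truncated at the
# start of the series, so the function receives 1, 2, ..., 5 values before
# it ever receives a full window of 6.
window_sizes <- rollapplyr(x, 6, length, partial = TRUE)
print(window_sizes)
```

Sum relies on exactly this: a window shorter than 6 signals one of the first five positions, where the current value is returned unchanged.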
    

    2) sum 6 and subtract off original. Another possibility is to take the running sum of 6 elements, filling the first 5 positions with NA, and subtract the original vector from it. Finally, fill in the first 5 positions with the original values.

    replace(rollsumr(x, 6, fill = NA) - x, 1:5, head(x, 5))
    ##  [1]  1  2  3  4  5 15 20 25 30 35
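For comparison, the same sum-6-and-subtract idea can be sketched in base R without zoo. This is an alternative technique, not part of the answer above: it uses stats::filter with a one-sided window of six ones, which gives NA for the first five positions much like rollsumr(x, 6, fill = NA). The names s6 and y are introduced here for illustration.

```r
x <- 1:10

# One-sided moving sum over the current value and the 5 values before it.
# The first 5 positions have no full window, so they come back as NA.
s6 <- as.numeric(stats::filter(x, rep(1, 6), sides = 1))

# Subtract the current value to leave the sum of the previous 5,
# then overwrite the first 5 positions with the original values.
y <- replace(s6 - x, 1:5, head(x, 5))
print(y)
```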
    

    3) specify offsets. A third possibility is to use the offset form of width to specify the prior 5 elements. The result starts at the 6th position, so the first 5 values of x are prepended:

    c(head(x, 5), rollapplyr(x, list(-(1:5)), sum))
    ## [1]  1  2  3  4  5 15 20 25 30 35
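The offsets in approach 3 are positions relative to the current element. A rough base R sketch of that index arithmetic follows; it illustrates the idea and is not how zoo implements it. The names offsets, valid, and prev5 are made up for this example.

```r
x <- 1:10
offsets <- -(1:5)  # the 5 positions immediately before the current one

# For position i the window is x[i + offsets]. It only exists when
# i + min(offsets) >= 1, that is, from i = 6 onwards.
i <- 6
print(x[i + offsets])       # the five values before position 6
print(sum(x[i + offsets]))

# Apply the same indexing to every position with 5 predecessors,
# then prepend the first 5 original values as in the text.
valid <- 6:length(x)
prev5 <- sapply(valid, function(i) sum(x[i + offsets]))
y <- c(head(x, 5), prev5)
print(y)
```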
    

    4) alternative specification of offsets. In this alternative we specify an offset of 0 for each of the first 5 elements and offsets of -(1:5) for the rest.

    width <- replace(rep(list(-(1:5)), length(x)), 1:5, list(0))
    rollapply(x, width, sum)
    ## [1]  1  2  3  4  5 15 20 25 30 35
    

    Note

    The scheme for filling in the first 5 elements seems quite unusual. You might instead consider using partial sums for the first 5, with NA or 0 for the first one since there are no prior elements for that one:

    rollapplyr(x, list(-(1:5)), sum, partial = TRUE, fill = NA)
    ## [1] NA  1  3  6 10 15 20 25 30 35
    
    rollapplyr(x, list(-(1:5)), sum, partial = TRUE, fill = 0)
    ## [1]  0  1  3  6 10 15 20 25 30 35
    
    rollapplyr(x, 6, sum, partial = TRUE) - x
    ## [1]  0  1  3  6 10 15 20 25 30 35
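To tie this back to the data frame in the question, here is a minimal sketch, assuming zoo is installed. It stores the requested column using approach 3 and, for comparison, the partial-sum variant from the note; the column name y_partial is made up for this example.

```r
library(zoo)

df <- data.frame(x = 1:10)

# Requested result: first 5 rows copy x, later rows sum the previous 5 values.
df$y <- c(head(df$x, 5), rollapplyr(df$x, list(-(1:5)), sum))

# Variant from the note: partial sums of the previous values, 0 in row 1.
df$y_partial <- rollapplyr(df$x, 6, sum, partial = TRUE) - df$x

print(df)
```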