Tags: r, ggplot2, rstudio, rollapply

How to calculate the running mean starting from the 4th row of "datamean" (given the width=4) and have first three rows as "NA"?


How can I calculate the running mean so that it starts at the 4th row of "datamean" (since width=4), with the first and last three rows set to "NA" or left empty?

 require(zoo)    
 data <- zoo(seq(1:10))
 datamean <- rollapply(data, width=4, by=1, FUN=mean, align="left")
 cbind(data, datamean)

Currently the output is this:

    data datamean
1     1      2.5
2     2      3.5
3     3      4.5
4     4      5.5
5     5      6.5
6     6      7.5
7     7      8.5
8     8       NA
9     9       NA
10   10       NA

However, I want:

    data datamean
1     1      NA
2     2      NA
3     3      NA
4     4      2.5
5     5      3.5
6     6      4.5
7     7      5.5
8     8      NA
9     9      NA
10   10      NA

Solution

  • We can calculate the rolling mean first and then adjust the datamean column afterwards. rollmean with align = "right" and fill = NA leaves the first three rows as NA; mutate and ifelse then check the row number and replace the values in the last three rows with NA. dt2 is the final output.

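    library(dplyr)
    library(zoo)
    
    # Example data: the values 1 to 10 as a zoo series
    dt <- tibble(data = zoo(1:10))
    
    dt2 <- dt %>%
      # Right-aligned rolling mean; fill = NA pads the first three rows
      mutate(datamean = rollmean(data, k = 4, fill = NA, align = "right")) %>%
      # Set the last three rows to NA as well
      mutate(datamean = ifelse(row_number() %in% n():(n() - 2), NA, datamean))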
    
    dt2
    # A tibble: 10 x 2
            data datamean
       <S3: zoo>    <dbl>
     1         1       NA
     2         2       NA
     3         3       NA
     4         4      2.5
     5         5      3.5
     6         6      4.5
     7         7      5.5
     8         8       NA
     9         9       NA
    10        10       NA
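
If dplyr is not needed, the same two steps can probably be done with zoo and base R alone. The following is only a minimal sketch under a few assumptions: the input is a plain integer vector rather than a zoo series, and the names x, k, and n are illustrative and not part of the original question.

```r
library(zoo)

x <- 1:10
k <- 4  # window width, as in the question

# align = "right" places each mean on the last row of its window,
# and fill = NA pads the first k - 1 rows.
datamean <- rollmean(x, k = k, fill = NA, align = "right")

# Blank the last k - 1 rows as well, as the question asks.
n <- length(datamean)
datamean[(n - k + 2):n] <- NA

data.frame(data = x, datamean = datamean)
```

Deriving the blanked range from k instead of hard-coding the last three rows means the sketch should still behave sensibly if the window width changes.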