Search code examples
rzoorollapply

How to modify window width in rollapplyr() function based on column names in R


I have a data set that contains several variables of interest. What I'd like to do is compute a rolling average with the rollapplyr() function from the zoo package for each of them as additional columns, but the caveat is I'd like to use a different window width (width = ) depending on the column. I also wish to avoid pivoting my data to long format.

library(tidyverse)
library(zoo)

rolling_f <- function(x, wdw, fun){ rollapplyr(x, wdw, fun, partial = TRUE) }

set.seed(1)
ex <- data.frame(
  var1 = rnorm(20, 8000, 500),
  var2 = rnorm(20, 8000, 500),
  var3 = rnorm(20, 8000, 500),
  var4 = rnorm(20, 8000, 500),
  var5 = rnorm(20, 8000, 500),
  var6 = rnorm(20, 8000, 500),
  var7 = rnorm(20, 8000, 500),
  var8 = rnorm(20, 8000, 500)
  ) %>%
  mutate(
    across(
      .cols = starts_with("var"),
      .fns = ~rolling_f(.x, 5, mean),
      .names = "{.col}_roll"
      )
    )

My code so far computes the same window width (5) for every column, but I'm hoping someone may be able to help me define my custom function further so the window can be 3 for var1:var3, 5 for var4:var5, and 6 for var6:var8, as an example. I assume this will simply require some additional code in the custom rolling_f() function I have started.


Solution

  • You could use case_match()/case_when() against the current column name in across to determine the window size:

    library(dplyr)
    library(readr)
    library(zoo)
    
    rolling_f <- function(x, wdw, fun){ rollapplyr(x, wdw, fun, partial = TRUE) }
    
    # case_match()
    ex %>%
      mutate(across(
        .cols = starts_with("var"),
        .fns = ~ rolling_f(.x, case_match(
          parse_number(cur_column()), 1:3 ~ 3, 4:5 ~ 5, 6:8 ~ 6
        ), mean),
        .names = "{.col}_roll"
      ))
    
    # case_when()
    ex %>%
      mutate(across(
        .cols = starts_with("var"),
        .fns = ~ rolling_f(
          .x,
          case_when(
            cur_column() %in% paste0("var", 1:3) ~ 3,
            cur_column() %in% paste0("var", 4:5) ~ 5,
            cur_column() %in% paste0("var", 6:8) ~ 6
          ),
          mean
        ),
        .names = "{.col}_roll"
      ))