Search code examples
rslidedplyr

Rolling window with slide_dbl() on grouped data


This is an extension to following question: Rolling window slider::slide() with grouped data

I want to mutate a column of my grouped tibble with slide_dbl(), i.e. applying slide_dbl() on all groups, but only within them, not across them.

When running the solution of linked question I receive following error message:

Error: Problem with `mutate()` input `rollreg`.
x Inapplicable method for 'mutate_' applied to object of class "c('double', 'numeric')".

My tibble has following structure:

tibble [450,343 x 3] (S3: grouped_df/tbl_df/tbl/data.frame)
 $ company: num [1:450343] 1 1 1 1 1 ...
 $ date: Date[1:450343], format: "2011-11-30" "2011-12-31" "2012-01-31" "2012-02-29" ...
 $ result: num [1:450343] NA NA NA 12.5981 -2.9023 ...
 - attr(*, "groups")= tibble [3,339 x 2] (S3: tbl_df/tbl/data.frame)
  ..$ company: num [1:3339] 1 2 3 4 5 ...
  ..$ .rows : list<int> [1:3339] 

To complete, this is the code I ran according to the linked solution:

testtest <- data %>%
  group_by(company) %>% nest() %>%
  mutate(rollreg = map(data, ~ .x %>% mutate(result_2 = slide_dbl(.x = .$result, .f = ~prod(1+.)-1, .before = 11, .after = -1, complete=TRUE)))) %>%
  select(-data) %>% unnest(rollreg)

Here, above mentioned error message occurs. I guess it's because of the data structure. Yet, I can't figure any solution (also not with similar functions like group_map() or group_modify()). Can anyone help? Thanks in advance!


Solution

  • An option is group_split by the grouping column (in the example, using 'case', loop over the list of datasets with map, create new column in mutate by applying the slide_dbl

    library(dplyr)
    library(tidyr)
    library(purrr)
    data %>% 
       group_split(case) %>%
       map_dfr(~ .x %>% 
          mutate(out = slide_dbl(r1, .f = ~ prod(1 + .x) - 1, 
              .before = 5, .after = -1, complete = TRUE)))
    

    -output

    # A tibble: 30 x 6
    #       t case       r1      r2     r3    out
    #   <int> <chr>   <dbl>   <dbl>  <dbl>  <dbl>
    # 1     1 a     -0.294  -0.164   1.33   0    
    # 2     2 a      0.761   1.01    0.115 -0.294
    # 3     3 a     -0.781  -0.499   0.290  0.243
    # 4     4 a     -0.0732 -0.110   0.289 -0.728
    # 5     5 a     -0.528   0.707   0.181 -0.748
    # 6     6 a     -1.35   -0.411  -1.47  -0.881
    # 7     7 a     -0.397  -1.28    0.172 -1.06 
    # 8     8 a      1.68    0.956  -2.81  -1.02 
    # 9     9 a     -0.0167 -0.0727 -1.08  -1.24 
    #10    10 a      1.25   -0.326   1.61  -1.26 
    ## … with 20 more rows
    

    Or if we need to use the nest_by, it creates an attribute rowwise, so, it is better to ungroup before applying

    out1 <- data %>%
        select(-t) %>% 
        nest_by(case) %>%
        ungroup %>%
        mutate(data = map(data, ~ .x %>% 
                 mutate(out = slide_dbl(r1, .f = ~ prod(1 + .x) - 1, 
             .before = 5, .after = -1, complete = TRUE))))
    

    -output

    out1
    # A tibble: 3 x 2
    #  case  data             
    #  <chr> <list>           
    #1 a     <tibble [10 × 4]>
    #2 b     <tibble [10 × 4]>
    #3 c     <tibble [10 × 4]>   
    

    Now, we unnest the structure

     out1 %>% 
        unnest(data)
    # A tibble: 30 x 5
    #   case       r1      r2     r3    out
    #   <chr>   <dbl>   <dbl>  <dbl>  <dbl>
    # 1 a     -0.294  -0.164   1.33   0    
    # 2 a      0.761   1.01    0.115 -0.294
    # 3 a     -0.781  -0.499   0.290  0.243
    # 4 a     -0.0732 -0.110   0.289 -0.728
    # 5 a     -0.528   0.707   0.181 -0.748
    # 6 a     -1.35   -0.411  -1.47  -0.881
    # 7 a     -0.397  -1.28    0.172 -1.06 
    # 8 a      1.68    0.956  -2.81  -1.02 
    # 9 a     -0.0167 -0.0727 -1.08  -1.24 
    #10 a      1.25   -0.326   1.61  -1.26 
    # … with 20 more rows
    

    data

    data <- tibble(t = rep(1:10, 3), 
                   case = c(rep("a", 10), rep("b", 10), rep("c", 10)),
                   r1 = rnorm(30),
                   r2 = rnorm(30),
                   r3 = rnorm(30))