Search code examples
rdataframedplyrrollapply

How can I use rollapply after group by in a whole data frame in R?


I have a data frame that looks like this:

library(tidyverse)
library(zoo)
date = c(rep(as.Date("2022/1/1"),5),rep(as.Date("2022/1/12"),5)) 
a = seq(1,10,1)
b = seq(-1,-10,-1)
c = seq(10,-20,length.out=10)
window=c(rep(2,5),rep(3,5))
A = tibble(date,a,b,c,window);A

   date           a     b       c window
   <date>     <dbl> <dbl>   <dbl>  <dbl>
 1 2022-01-01     1    -1  10          2
 2 2022-01-01     2    -2   6.667      2
 3 2022-01-01     3    -3   3.333      2
 4 2022-01-01     4    -4   0          2
 5 2022-01-01     5    -5  -3.333      2
 6 2022-01-12     6    -6  -6.667      3
 7 2022-01-12     7    -7 -10          3
 8 2022-01-12     8    -8 -13.33       3
 9 2022-01-12     9    -9 -16.67       3
10 2022-01-12    10   -10 -20          3

I want to group by date the columns a,b,c and apply a rolling window of width equal the window value for this date in the all the columns a,b,c and calculate the minimum of them.

For example, for the date 2022-01-01 with window 2 it will search for the minimum in the first row (which will be -1) then will move downwards by 1 so will search the minimum in the first and second row that will be again -2,then into second and third that will be -3, to third and fourth that will be -4,then the fourth and fifth row that will be -5, and last the fifth row that will be -5 again.

The same procedure for the date 2022-01-12 with width 3 and will be, -10,-13,33,-16,67,-20 and -20.

Ideally I want the resulted data frame to be the minimum of all minimums grouped by - summarized by date:

date MINS
2022-01-01 -5
2022-01-12 -20

I think that rollapply function for whole matrices can be done with by.column = FALSE

So I tried :

A%>%group_by(date)%>%
  summarise(Vectors = rollapply(A[,2:4],width = window,min,by=1,by.column=FALSE))

but it doesn't work.

Any help ?


Solution

  • To get the final result you said you want, you might as well skip the rolling windows altogether:

    A %>% 
      group_by(date) %>% 
      summarise(min = min(a, b, c))
    #> # A tibble: 2 x 2
    #>   date         min
    #>   <date>     <dbl>
    #> 1 2022-01-01    -5
    #> 2 2022-01-12   -20
    

    If you do want the rolling minimums: You’re on the right track, but you need to use the grouped data (don't reference A inside the summarise()), and set align and partial:

    A %>% 
      group_by(date) %>% 
      mutate(
        min = zoo::rollapply(
          cbind(a, b, c),
          window,
          min,
          align = "left",
          partial = TRUE,
          by.column = FALSE
        )
      )
    #> # A tibble: 10 x 6
    #> # Groups:   date [2]
    #>    date           a     b      c window   min
    #>    <date>     <dbl> <dbl>  <dbl>  <dbl> <dbl>
    #>  1 2022-01-01     1    -1  10         2  -2  
    #>  2 2022-01-01     2    -2   6.67      2  -3  
    #>  3 2022-01-01     3    -3   3.33      2  -4  
    #>  4 2022-01-01     4    -4   0         2  -5  
    #>  5 2022-01-01     5    -5  -3.33      2  -5  
    #>  6 2022-01-12     6    -6  -6.67      3 -13.3
    #>  7 2022-01-12     7    -7 -10         3 -16.7
    #>  8 2022-01-12     8    -8 -13.3       3 -20  
    #>  9 2022-01-12     9    -9 -16.7       3 -20  
    #> 10 2022-01-12    10   -10 -20         3 -20