r, datetime, time-series, zoo, rollapply

Rollapply in R time frame indexing


Let's say I have the following data frame holding a time series.

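library(zoo)
n = 10
# An explicit origin is required by as.Date() for numeric input in older R versions
date = as.Date(1:n, origin = "1970-01-01"); date
y = rnorm(n); y
dat = data.frame(date, y)
dat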

         date           y
1  1970-01-02 -0.02052313
2  1970-01-03  0.28255304
3  1970-01-04 -0.10718621
4  1970-01-05 -1.19299366
5  1970-01-06  1.17072468
6  1970-01-07  0.55849119
7  1970-01-08  0.30474050
8  1970-01-09 -0.30777180
9  1970-01-10 -0.01874367
10 1970-01-11 -0.74233556

To calculate the rolling standard deviation with a width (rolling window) of 3 days, I do:

a = zoo::rollapply(dat$y,3,sd)

With results:

       2         3         4         5         6 
1.7924102 0.4189522 0.4599979 0.4164408 0.3786601 
        7         8         9 
0.9481849 0.7048868 0.2494578 

And to find the maximum standard deviation:

max(a)
1.79241

Now I want to find out which 3-day interval this maximum refers to. How can I do that? Imagine that my series spans 20 years: I want to find the maximum standard deviation over those 20 years and extract the specific 3-day interval.


Solution

  • Updated in response to the OP's clarification about working days: the date vector now skips weekends, in a nominal manner (I have not checked whether the skipped dates are real weekends; the point is only to leave gaps in the dates).

    set.seed(123)
    
    n = 10
    
    date = as.Date(c(1, 4:8, 11:14), origin = "1970-01-01")
    
    y = rnorm(10)
    
    dat = data.frame(date ,y)
    
    roll_win <- 3
    
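    # Right-aligned rolling sd: row i holds the sd of rows (i - roll_win + 1) to i
    dat$a = zoo::rollapplyr(dat$y, roll_win, sd, fill = NA)
    
    # Row index of the window(s) with the largest sd; which() drops the NA rows
    i_max <- which(dat$a == max(dat$a, na.rm = TRUE))
    
    dat_max <- dat[i_max, ]
    
    # Index the full data frame so the start date of the first window is still available
    dat_max$date_start <- dat$date[i_max - (roll_win - 1)]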
    
    dat_max
    #>         date         y        a date_start
    #> 8 1970-01-13 -1.265061 1.496275 1970-01-09
    

    Created on 2022-02-22 by the reprex package (v2.0.1)
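
    For a longer series (such as the 20 years mentioned in the question), the same idea could be wrapped in a small helper. This is only a sketch under stated assumptions: `max_sd_window` is a hypothetical name, not a zoo function. It assumes the rows are already sorted by date, and it treats a window as `width` consecutive observations rather than `width` calendar days. If several windows tie for the maximum, it returns only the first.

```r
library(zoo)

# Hypothetical helper (not part of zoo): returns the start date, end date and
# sd of the window of `width` consecutive observations with the largest sd.
# Assumes `dates` is sorted and has the same length as `y`.
max_sd_window <- function(dates, y, width = 3) {
  # Without `fill`, the result has length(y) - width + 1 elements;
  # element i is the sd of observations i to i + width - 1.
  s <- zoo::rollapply(y, width, sd)
  i <- which.max(s)  # first window with the largest sd
  data.frame(date_start = dates[i],
             date_end   = dates[i + width - 1],
             sd         = unname(s[i]))
}

# Same data as in the answer above
set.seed(123)
date <- as.Date(c(1, 4:8, 11:14), origin = "1970-01-01")
y <- rnorm(10)

max_sd_window(date, y, width = 3)
```

    With the seed above this should report the same window as the reprex output: 1970-01-09 to 1970-01-13.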