Tags: r, split, subset, xts, quantitative-finance

Sum xts elements on a list row by row


I have an xts object called data which contains 5-minute returns for the period from 2015-01-01 17:00:00 to 2015-12-31 17:00:00. Each trading day starts at 17:00:00 and ends at the same time on the next day, which gives 288 intraday returns per day (24 hours * 60 minutes / 5 minutes = 288). The data look like this:

head(data, 5)
                          DPRICE
2015-01-01 17:00:00 0.000000e+00
2015-01-01 17:05:00 9.797714e-05
2015-01-01 17:10:00 2.027022e-04
2015-01-01 17:15:00 2.735798e-04
2015-01-01 17:20:00 7.768653e-05

tail(data, 5)
                          DPRICE
2015-12-31 16:40:00 0.0001239429
2015-12-31 16:45:00 0.0001272704
2015-12-31 16:50:00 0.0010186764
2015-12-31 16:55:00 0.0006841370
2015-12-31 17:00:00 0.0002481227

I am trying to standardize the returns by the average absolute value for each 5-minute intraday interval, following McMillan and Speight, "Daily FX Volatility Forecasts" (2012).

The mathematical formula is: [formula image not available in this copy]
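
Going only by the verbal description above (each return divided by the average absolute return of its 5-minute interval across days), the standardization can be sketched as below. The notation ($r_{t,n}$, $T$) is assumed for illustration and may differ from the paper's:

```latex
\tilde{r}_{t,n} = \frac{r_{t,n}}{\dfrac{1}{T}\sum_{s=1}^{T} \lvert r_{s,n} \rvert},
\qquad t = 1,\dots,T, \quad n = 1,\dots,288
```

Here $r_{t,n}$ is the return in the $n$-th 5-minute interval of trading day $t$, and $T$ is the number of trading days in the sample.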

My code is:

library(xts)
std_data <- abs(data)         # absolute returns
D <- split(std_data, "days")  # split the series by calendar day
# stitch 17:00-23:55 of one calendar day to 00:00-16:55 of the next;
# creates a list with 365 elements, each containing 288 returns
mts.days <- lapply(seq_along(D) - 1, function(i) {
  if (i > 0) rbind(D[[i]]["T17:00:00/T23:55:00"], D[[i + 1]]["T00:00:00/T16:55:00"])
})
# attempt to add the first, second, ... observations across the elements
dummy <- mapply(sum, mts.days)

With this code I create a list of 365 xts elements, each with dimensions:

> dim(mts.days[[2]])
[1] 288   1 

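I want to sum the observations that sit at the same position in each element (the first with the first, the second with the second, and so on across all days) to create the denominator of the formula above.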
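
To make the goal concrete, here is a small sketch with made-up numbers. The list `days`, the helper `make_day` and all values are hypothetical stand-ins for `mts.days`. `mapply(sum, ...)` collapses each element to a single total, whereas `Reduce()` with `+` on the underlying values adds the elements position by position, which is the row-by-row sum described above:

```r
library(xts)

# Hypothetical stand-in for mts.days: 3 "days" with 4 observations each
make_day <- function(start, values) {
  xts(matrix(values, ncol = 1, dimnames = list(NULL, "DPRICE")),
      order.by = as.POSIXct(start, tz = "UTC") + 300 * (seq_along(values) - 1))
}
days <- list(
  make_day("2015-01-01 17:00:00", c(1, 2, 3, 4)),
  make_day("2015-01-02 17:00:00", c(10, 20, 30, 40)),
  make_day("2015-01-03 17:00:00", c(100, 200, 300, 400))
)

# One total per list element (this is what mapply(sum, ...) returns)
per_day <- mapply(sum, days)

# Position-wise sum across elements; coredata() drops the time index so
# the values are added by row position rather than aligned by timestamp
row_sums <- Reduce(`+`, lapply(days, coredata))
```

This only works if every element has the same number of rows. The list built in the question starts with a NULL element (the `i = 0` iteration returns nothing), so it would need something like `Filter(Negate(is.null), mts.days)` first. Dividing `row_sums` by the number of days then gives the per-interval average.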


Solution

  • I don't understand your request, but will give it a shot nevertheless.

    ## generate bogus data
    library(quantmod)
    set.seed(123)
    ndays <- 3
    ndatperday <- 288
    data <- cumsum(do.call("rbind", lapply(13:15, function(dd) {
        xts(rnorm(ndatperday) / 1e4,
            seq(as.POSIXct(paste0("2016-08-", dd, " 17:00:00")),
                length.out = ndatperday, by = 300))
    })))
    colnames(data) <- "DPRICE"

    ## calculate percentage returns
    ret <- ROC(data, type = "discrete")

    ## this is probably not what you need: returns divided by the overall mean absolute return
    ret / mean(abs(ret), na.rm = TRUE)

    ## I suspect you need returns divided by the daily mean absolute return
    library(dplyr)
    ret.df <- data.frame(ret)
    ## create a factor identifying the 3 days of bogus data
    ret.df$day <- rep(paste0("2016-08-", 13:15), each = ndatperday)
    ## compute the daily mean absolute return
    dail <- ret.df %>%
        group_by(day) %>%
        summarise(mean = mean(abs(DPRICE), na.rm = TRUE))
    ## attach each daily mean to the observations of that day
    ret.df <- ret.df %>% left_join(dail, by = "day")
    ## normalize
    ret.df$DPRICE <- ret.df$DPRICE / ret.df$mean


    Second shot: after reading the paper (http://onlinelibrary.wiley.com/doi/10.1002/for.1222/full) I might have understood what you were after:

    library(quantmod)
    library(dplyr)
    set.seed(123)

    ## generate bogus 5-min series (one value per timestamp in the index)
    idx <- seq(as.POSIXct("2015-01-01 17:00"),
               as.POSIXct("2015-12-31 17:00"), by = 300)
    data <- xts(0.1 + cumsum(rt(length(idx), df = 3)) / 1e4, order.by = idx)
    colnames(data) <- "DPRICE"

    ## calculate 5-min percentage returns
    ret <- ROC(data, type = "discrete")

    ## create a factor identifying the 5-minute intra-day interval
    ret.df <- as.data.frame(ret)
    ret.df$intra5 <- strftime(index(ret), format = "%H:%M")

    ## compute the mean absolute return (over the year) for each of the 288 5-minute intra-day intervals
    dail <- ret.df %>%
        group_by(intra5) %>%
        summarise(mean = mean(abs(DPRICE), na.rm = TRUE))

    ## attach the interval mean to each datapoint
    ret.df <- ret.df %>% left_join(dail, by = "intra5")

    ## normalize
    ret.df$DPRICE <- ret.df$DPRICE / ret.df$mean
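
If the result should stay an xts object rather than a data frame, the same per-interval normalization can be sketched with base R's `ave()`. This is an alternative under assumptions, not part of the answer above: the simulated returns, the UTC time zone (chosen to avoid daylight-saving shifts in the interval labels), and the names `slot`, `slot_mean` and `ret_std` are all illustrative.

```r
library(xts)

set.seed(1)
# Hypothetical 5-minute returns over 10 trading days, starting at 17:00 UTC
idx <- seq(as.POSIXct("2015-01-01 17:00:00", tz = "UTC"),
           by = 300, length.out = 10 * 288)
ret <- xts(matrix(rnorm(length(idx)) / 1e4, ncol = 1,
                  dimnames = list(NULL, "DPRICE")),
           order.by = idx)

# Label each observation with its intraday slot, e.g. "17:05"
slot <- format(index(ret), "%H:%M")

# Mean absolute return of each slot, repeated for every observation in that slot
slot_mean <- ave(abs(as.numeric(ret)), slot,
                 FUN = function(v) mean(v, na.rm = TRUE))

# Standardized returns, still an xts object with the original index
ret_std <- ret / slot_mean
```

As a sanity check, the mean of `abs(ret_std)` within each slot should come out as 1, since each slot is divided by its own mean absolute return.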