I have a stock dataset in xts format with one year of minute-by-minute data. I need to calculate returns, but only between observations on the same date. What would be the most efficient way to calculate the returns while excluding the return from yesterday's last observation to today's first observation?
In the same vein, if one wants to calculate 15-minute returns, how can one avoid computing returns over periods that span two days?
I am including a toy dataset with hourly periods (since the original dataset is at the minute level):
library(xts)
time_index <- seq(from = as.POSIXct("2021-01-01 07:00"), to = as.POSIXct("2021-02-28 18:00"), by = "hour")
set.seed(1)
value <- 100 + rnorm(n = length(time_index))
eventdata <- xts(value, order.by = time_index)
So how can I calculate three-hourly returns that stay within a single day?
It is probably easier to compute all the returns and then drop the overnight ones. To get 15-minute returns, you can create a time sequence at 15-minute intervals, subset the data to those timestamps, and then run the computation on the reduced data.
library(xts)
library(lubridate)
library(TTR)
time_index <-
  seq(
    from = as.POSIXct("2021-01-01 07:00"),
    to = as.POSIXct("2021-02-28 18:00"),
    by = "hour"
  )
set.seed(1)
value <- 100 + rnorm(n = length(time_index))
eventdata <- xts(value, order.by = time_index)
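# compute hourly returns, then drop the first observation of each day:
# its return is measured against the previous day's last observation
# (for the very first day it is NA)
res <- TTR::ROC(eventdata, type = "discrete")
# startof() is an unexported xts helper (hence :::) that gives the row
# index of the first observation in each period
res2 <- res[-xts:::startof(res, by = "days")]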
> head(res2)
                           [,1]
2021-01-01 08:00:00  0.00815204
2021-01-01 09:00:00 -0.01017404
2021-01-01 10:00:00  0.02451394
2021-01-01 11:00:00 -0.01245897
2021-01-01 12:00:00 -0.01146199
2021-01-01 13:00:00  0.01318717
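For the three-hourly case from the question, one option that avoids the unexported helper is to split the series by day and compute the returns inside each day. The following is only a sketch under that assumption: `intraday_roc` is a hypothetical helper name, not part of xts or TTR, and it rebuilds the toy data so that it runs on its own.

```r
library(xts)
library(TTR)

# Same toy data as above: hourly observations from 1 Jan 07:00 to 28 Feb 18:00
time_index <- seq(
  from = as.POSIXct("2021-01-01 07:00"),
  to = as.POSIXct("2021-02-28 18:00"),
  by = "hour"
)
set.seed(1)
eventdata <- xts(100 + rnorm(length(time_index)), order.by = time_index)

# Hypothetical helper: n-period discrete returns computed within each day
intraday_roc <- function(x, n = 1) {
  by_day <- split(x, f = "days")
  # skip days that are too short to hold a single n-period return
  by_day <- by_day[vapply(by_day, NROW, integer(1)) > n]
  per_day <- lapply(by_day, function(d) TTR::ROC(d, n = n, type = "discrete"))
  # the first n rows of every day are NA by construction, so drop them
  na.omit(do.call(rbind, per_day))
}

ret_1h <- intraday_roc(eventdata, n = 1)
ret_3h <- intraday_roc(eventdata, n = 3)
head(ret_3h)
```

With `n = 1` this should keep the same rows as `res2`. With `n = 3` each return compares an observation with the one three hours earlier on the same day, so the first three observations of every day are left out.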
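# compute 15-minute returns from one day of minute data
time_index2 <- seq(
  from = as.POSIXct("2021-01-01 07:00"),
  to = as.POSIXct("2021-01-01 18:00"),
  by = "min"
)
length(time_index2) # 661 one-minute timestamps
# 15-minute grid covering one day: 45 timestamps from 07:00 to 18:00.
# ymd_hms() parses in UTC by default, while as.POSIXct() above uses the session
# time zone, so the grid is shifted by the session's UTC offset unless
# tz = Sys.timezone() is passed to ymd_hms()
time_index3 <- seq(ymd_hms("2021-01-01 07:00:00"),
                   by = "15 min", length.out = 45)
# toy series centred on zero, so the "returns" below are not meaningful;
# use price levels with real data
testasset <- xts(rnorm(661, sd = 0.03), order.by = time_index2)
# keep only the grid timestamps present in the data, then compute returns
res3 <- TTR::ROC(testasset[time_index3], type = "discrete")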
> head(res3)
                          [,1]
2021-01-01 08:00:00         NA
2021-01-01 08:15:00 -0.6516079
2021-01-01 08:30:00 -5.7101543
2021-01-01 08:45:00 -0.5411609
2021-01-01 09:00:00 -2.2945892
2021-01-01 09:15:00 -2.3038205
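The example above covers a single day. For minute data spanning several days, the same idea can be combined with the per-day treatment: reduce the series to a 15-minute grid, then compute returns within each day so that no return crosses midnight. This is a minimal sketch on made-up data. The two trading days, the random-walk prices and the `day_minutes` helper are all assumptions for illustration, and the grid is chosen by keeping observations whose minute is a multiple of 15.

```r
library(xts)
library(TTR)

# Hypothetical helper: one-minute timestamps from 07:00 to 18:00 on a given day
day_minutes <- function(day) {
  seq(as.POSIXct(paste(day, "07:00")), as.POSIXct(paste(day, "18:00")), by = "min")
}

# Assumed toy data: two days of minute-level prices (random walk around 100)
idx <- c(day_minutes("2021-01-04"), day_minutes("2021-01-05"))
set.seed(1)
prices <- xts(100 + cumsum(rnorm(length(idx), sd = 0.03)), order.by = idx)

# Keep the observations at :00, :15, :30 and :45 of every hour
grid <- prices[.indexmin(prices) %% 15 == 0]

# Compute 15-minute returns separately for each day and drop the leading NA
per_day <- lapply(split(grid, f = "days"),
                  function(d) TTR::ROC(d, type = "discrete"))
ret_15m <- na.omit(do.call(rbind, per_day))
head(ret_15m)
```

Filtering on the minute of the hour assumes that the data actually contains an observation at each of those timestamps. With irregular data, `endpoints(prices, on = "minutes", k = 15)` may be a better way to pick one observation per 15-minute bucket.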