I want to run an out-of-sample forecast experiment using the auto.arima function, applying time series cross-validation with a fixed rolling window size. The goal is to obtain forecasts for 1, 3 and 6 steps ahead.
library(forecast)
library(tseries)
#the time series
y1 = 2+ 0.15*(1:20) + rnorm(20,2)
y2 = y1[20]+ 0.3*(1:30) + rnorm(30,2)
y = as.ts(c(y1,y2))
#10obs in test set, 40obs in training set
ntest <- 10
ntrain <- length(y)-ntest
#auto.arima with some preferred specifications
farima <- function(x,h){forecast(auto.arima(x,ic="aic",test=c("adf"),seasonal=FALSE,
stepwise=FALSE, approximation = FALSE,
method=c("ML")),h=h)}
# executing the following function gives the forecast errors in a matrix, one column per forecast horizon
e <- tsCV(y,farima,h = 6,window=40)
The predicted values are given by subtracting the error from the true value:
#predicted values
fc1 <- c(NA,y[2:50]-e[1:49,1])
fc1 <- fc1[41:50]
fc3 <- c(NA,y[2:50]-e[1:49,3])
fc3 <- fc3[41:50]
fc6 <- c(NA,y[2:50]-e[1:49,6])
fc6 <- fc6[41:50]
However, I'm not sure whether the predicted values for the 3-step-ahead forecasts are coded correctly, since the first 3-step-ahead forecast should be the prediction of the 43rd observation.
Also, I don't understand why the matrix e has a value for observation 40 in the 3-step-ahead errors (3rd column). I thought the first 3-step-ahead forecast is obtained for observation 43, so there shouldn't be an error for observation 40.
Always read the help file:
Value
Numerical time series object containing the forecast errors as a vector (if h=1) and a matrix otherwise. The time index corresponds to the last period of the training data. The columns correspond to the forecast horizons.
So tsCV() returns errors in a matrix where the (i,j)th entry contains the error for forecast origin i and forecast horizon j. So the value in row 40 and column 3 is the error of a 3-step forecast made at time 40 for time period 43.
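It follows that recovering the forecasts from e needs a shift of j rows for horizon j, not a fixed shift of one row, so the fc3 and fc6 lines in the question are misaligned. A minimal sketch of that alignment is below. It assumes a simulated 50-observation series and uses naive() in place of the auto.arima wrapper to keep the run short; the helper name forecast_at is hypothetical and not part of the forecast package.

```r
library(forecast)

set.seed(1)
y <- ts(cumsum(rnorm(50)))   # hypothetical series, 50 observations
h <- 6
window <- 40

# naive() stands in for the auto.arima wrapper from the question
e <- tsCV(y, naive, h = h, window = window)

# Row i, column j of e holds y[i + j] minus the forecast made at origin i,
# so the j-step forecast of time t is y[t] - e[t - j, j].
forecast_at <- function(y, e, j) {
  n <- length(y)
  fc <- rep(NA_real_, n)
  fc[(j + 1):n] <- y[(j + 1):n] - e[1:(n - j), j]
  fc
}

# Forecasts for observations 41 to 50
fc1 <- forecast_at(y, e, 1)[41:50]
fc3 <- forecast_at(y, e, 3)[41:50]   # NA for 41 and 42, first value is for 43
fc6 <- forecast_at(y, e, 6)[41:50]   # NA for 41 to 45, first value is for 46
```

With a window of 40, the first forecast origin is observation 40, so the first available h-step forecast is for observation 40 + h. Swapping naive for farima should give the ARIMA forecasts with the same alignment.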