I am working with time series 551 of the monthly data of the M3 competition.
My data is:
library(forecast)
library(Mcomp)
# Time Series
# Subset the M3 data to contain the relevant series
ts.data<- subset(M3, 12)[[551]]
print(ts.data)
I want to implement time series cross-validation for the last 18 observations of the in-sample interval.
Some people would call this “forecast evaluation with a rolling origin” or something similar.
How can I achieve that? What does the in-sample interval mean? Which time series must I evaluate?
I'm quite confused, so any help to shed light on this would be welcome.
The tsCV function of the forecast package is a good place to start.
From its documentation,
tsCV(y, forecastfunction, h = 1, window = NULL, xreg = NULL, initial = 0, ...)
Let ‘y’ contain the time series y[1:T]. Then ‘forecastfunction’ is applied successively to the time series y[1:t], for t=1,...,T-h, making predictions f[t+h]. The errors are given by e[t+h] = y[t+h]-f[t+h].
That is, tsCV first fits a model to y[1] and forecasts y[1 + h], then fits a model to y[1:2] and forecasts y[2 + h], and so on for T-h steps.
The tsCV function returns the forecast errors.
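To make the rolling-origin idea concrete, here is a minimal sketch in base R of the loop described above. It is not the tsCV source: the helper name rolling_origin_naive, the toy numbers, and the choice of a naive forecast (last observed value) are all assumptions made for illustration, and in this sketch the error is stored at the position of the forecast origin t.

```r
# Toy monthly series (made-up numbers, not M3 data)
y <- ts(c(12, 15, 14, 18, 21, 19, 24, 26), frequency = 12)

# Hypothetical helper: rolling-origin errors for a naive forecast
rolling_origin_naive <- function(y, h = 1) {
  n <- length(y)
  e <- rep(NA_real_, n)        # e[t] holds the error made from origin t
  for (t in seq_len(n - h)) {
    train <- y[1:t]            # data available at origin t
    fc <- train[t]             # naive forecast: repeat the last observed value
    e[t] <- y[t + h] - fc      # error for the forecast of y[t + h]
  }
  e
}

errs <- rolling_origin_naive(y, h = 1)

# RMSE over all origins that produced a forecast
sqrt(mean(errs^2, na.rm = TRUE))
```

Swapping the naive forecast for a fitted model at each origin gives the same kind of procedure that tsCV automates.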
Applying this to the training data of ts.data (the in-sample part, ts.data$x):
# function to fit a model and forecast
fmodel <- function(x, h){
  forecast(Arima(x, order = c(1, 1, 1), seasonal = c(0, 0, 2)), h = h)
}
# time-series CV on the training (in-sample) series, one-step-ahead forecasts
cv_errs <- tsCV(ts.data$x, fmodel, h = 1)
# RMSE of the time-series CV
sqrt(mean(cv_errs^2, na.rm=TRUE))
# [1] 778.7898
In your case, it may be that you are supposed to