Implementation of time series cross-validation

I am working with time series 551 of the monthly data of the M3 competition.

My data is:

library(forecast)
library(Mcomp)
# Time Series
# Subset the M3 data to the monthly series and pick series 551
ts.data <- subset(M3, 12)[[551]]
print(ts.data)

I want to implement time series cross-validation for the last 18 observations of the in-sample interval.

Some people would normally call this “forecast evaluation with a rolling origin” or something similar.

How can I achieve that? What does "the in-sample interval" mean? Which time series am I supposed to evaluate?

I'm quite confused, so any help clearing this up would be welcome.

Solution

The tsCV function of the forecast package is a good place to start.

From its documentation,

tsCV(y, forecastfunction, h = 1, window = NULL, xreg = NULL, initial = 0, ...)

Let ‘y’ contain the time series y[1:T]. Then ‘forecastfunction’ is applied successively to the time series y[1:t], for t=1,...,T-h, making predictions f[t+h]. The errors are given by e[t+h] = y[t+h]-f[t+h].

That is, tsCV first fits a model to y[1] and forecasts y[1 + h], then fits a model to y[1:2] and forecasts y[2 + h], and so on for T-h steps.

The tsCV function returns the forecast errors.
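To make the procedure concrete, here is a minimal base R sketch of the same rolling-origin loop written by hand. The toy series and the last_value helper are hypothetical stand-ins of mine (not part of forecast), and storing each error at the forecast origin t is my reading of the tsCV documentation, so treat it as an illustration rather than the package's actual implementation.

```r
# Toy series, purely illustrative (not the M3 data)
y <- c(10, 12, 13, 15, 14, 18)
h <- 1
n <- length(y)

# Hypothetical stand-in for a forecast function:
# repeat the last observed value h times
last_value <- function(x, h) rep(x[length(x)], h)

# errs[t] holds the h-step-ahead error made from forecast origin t
errs <- rep(NA_real_, n)
for (t in seq_len(n - h)) {
  f <- last_value(y[1:t], h)    # "fit" on y[1:t]
  errs[t] <- y[t + h] - f[h]    # compare the forecast with y[t + h]
}
errs
```

The last h entries stay NA because there is no later observation to compare against.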

Applying this to the training (in-sample) part of ts.data, which is ts.data$x:

# function to fit a model and forecast
fmodel <- function(x, h){
  forecast(Arima(x, order=c(1,1,1), seasonal = c(0, 0, 2)), h=h)
}
 
# time-series CV
cv_errs <- tsCV(ts.data$x, fmodel, h = 1)

# RMSE of the time-series CV
sqrt(mean(cv_errs^2, na.rm=TRUE))
# [1] 778.7898
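If "the last 18 observations of the in-sample interval" means that only the last 18 values of ts.data$x should be forecast, one possible sketch is to use the initial argument of tsCV to skip the earlier origins. The names y, n, cv_last18 and errs18 are my own, and I am assuming that tsCV stores each error at the index of its forecast origin, so the relevant origins are n-18, ..., n-1.

```r
library(forecast)
library(Mcomp)

ts.data <- subset(M3, 12)[[551]]
y <- ts.data$x    # in-sample part of the series
n <- length(y)

# Same model as above: fit and forecast h steps ahead
fmodel <- function(x, h) {
  forecast(Arima(x, order = c(1, 1, 1), seasonal = c(0, 0, 2)), h = h)
}

# Skip the first n - 19 origins, so the first origin used is n - 18
cv_last18 <- tsCV(y, fmodel, h = 1, initial = n - 19)

# One-step errors for the targets y[n - 17], ..., y[n]
errs18 <- cv_last18[(n - 18):(n - 1)]

# RMSE over those 18 one-step forecasts
sqrt(mean(errs18^2, na.rm = TRUE))
```

Skipping the early origins also avoids fitting a seasonal ARIMA model to very short training windows, where the fit may fail and produce NA errors.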

In your case, it may be that you are supposed to:

fit a model to ts.data$x and then forecast ts.data$xx[1]
fit a model to c(ts.data$x, ts.data$xx[1]) and then forecast ts.data$xx[2], and so on.
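The steps above could be sketched as an explicit loop over the hold-out part ts.data$xx. This is only one reading of the task. The names train, test, full and errs are mine, the ARIMA order is simply reused from the earlier example, and a failed fit is recorded as NA instead of stopping the loop.

```r
library(forecast)
library(Mcomp)

ts.data <- subset(M3, 12)[[551]]
train <- ts.data$x     # in-sample observations
test  <- ts.data$xx    # out-of-sample observations

# Join both parts into one series so the training window can grow
full <- ts(c(train, test), start = start(train),
           frequency = frequency(train))
n_train <- length(train)
k <- length(test)

errs <- rep(NA_real_, k)
for (i in seq_len(k)) {
  # Training window: all of x plus the first i - 1 values of xx
  y_i <- subset(full, end = n_train + i - 1)
  errs[i] <- tryCatch({
    fit <- Arima(y_i, order = c(1, 1, 1), seasonal = c(0, 0, 2))
    test[i] - forecast(fit, h = 1)$mean[1]   # one-step-ahead error
  }, error = function(e) NA_real_)
}

# RMSE of the one-step rolling-origin forecasts
sqrt(mean(errs^2, na.rm = TRUE))
```

Refitting the model at every origin is the simplest choice; keeping the parameters estimated on ts.data$x fixed and only updating the data would be a cheaper variant.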