
Why do I see a jump in my time series forecast?


I am using auto.arima() from the forecast package and am getting some strange results in the forecast.

library(forecast)

x <- structure(c(1.92, 2.1, 1.73, 1.35, 1.29, 1.35, 1.42, 1.46, 1.6, 
1.67, 1.98, 1.78, 1.77, 2.35, 1.93, 1.43, 1.29, 1.26, 1.93, 2.33, 
2.22, 2.19, 2.15, 2.25, 3.12, 3.32, 2.72, 2.28, 2.28, 2.16, 2.81, 
3.12, 2.85, 2.98, 3.3, 3.06, 3.56, 3.81, 3.48, 2.64, 2.91, 3.35, 
3.73, 3.58, 4, 3.94, 3.79, 3.85), .Tsp = c(2012, 2015.91666666667, 
12), class = "ts")

fit <- auto.arima(x)

plot(forecast(fit, 12)) #forecast and actual data
f2 <- fitted.values(fit)
lines(f2, col="red") #add predicted values during training


I don't understand why the fitted values (red line) are very close to the observed values (black), yet there is such a big jump at the first forecast.

Any ideas why we see this jump? I've seen other posts on Stack Exchange where the xreg option was used, but I am not using it here, so I haven't been able to track down a similar post.


Solution

  • I tend to believe that auto.arima slightly overfits the data. A quick exploratory analysis with the ACF suggests that (0,1,2)(0,1,0)[12] is already a decent model. I will use arima0 from the stats package (part of base R) to fit it:

    fit0 <- arima0(x, order = c(0,1,2), seasonal = c(0,1,0))
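
The answer does not show the exploratory step, so the following is only a guess at what it might look like: difference the series once at lag 12 and once at lag 1 (matching the two differencing orders in the model) and inspect the ACF of the result. The choice of `lag.max = 24` and the name `dx` are illustrative assumptions.

```r
# Rebuild the series from the question (monthly, Jan 2012 to Dec 2015).
x <- ts(c(1.92, 2.1, 1.73, 1.35, 1.29, 1.35, 1.42, 1.46, 1.6,
          1.67, 1.98, 1.78, 1.77, 2.35, 1.93, 1.43, 1.29, 1.26, 1.93, 2.33,
          2.22, 2.19, 2.15, 2.25, 3.12, 3.32, 2.72, 2.28, 2.28, 2.16, 2.81,
          3.12, 2.85, 2.98, 3.3, 3.06, 3.56, 3.81, 3.48, 2.64, 2.91, 3.35,
          3.73, 3.58, 4, 3.94, 3.79, 3.85),
        start = c(2012, 1), frequency = 12)

# Seasonal difference (lag 12) followed by a first difference (lag 1).
dx <- diff(diff(x, lag = 12))

# ACF of the doubly differenced series; the lags that stand out hint at
# the MA order. lag.max = 24 is an arbitrary choice for this sketch.
acf(dx, lag.max = 24)
```

With only 48 observations, 13 are lost to differencing, so the ACF is based on 35 points and should be read with caution.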
    

    Forecasting is done with predict(), which dispatches to predict.arima0:

    pred <- predict(fit0, n.ahead = 12, se.fit = FALSE)
    

    Let's plot observed series and forecast together:

    ts.plot(x, pred, col = 1:2)
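
To put a number on the first forecast step, one option is to compare it with the typical month-to-month change in the observed series. This is a rough sketch rather than a formal test; the names `first_step` and `typical` are introduced here for illustration, and using the standard deviation of `diff(x)` as the yardstick is an assumption.

```r
# Rebuild the series from the question (monthly, Jan 2012 to Dec 2015).
x <- ts(c(1.92, 2.1, 1.73, 1.35, 1.29, 1.35, 1.42, 1.46, 1.6,
          1.67, 1.98, 1.78, 1.77, 2.35, 1.93, 1.43, 1.29, 1.26, 1.93, 2.33,
          2.22, 2.19, 2.15, 2.25, 3.12, 3.32, 2.72, 2.28, 2.28, 2.16, 2.81,
          3.12, 2.85, 2.98, 3.3, 3.06, 3.56, 3.81, 3.48, 2.64, 2.91, 3.35,
          3.73, 3.58, 4, 3.94, 3.79, 3.85),
        start = c(2012, 1), frequency = 12)

# Same model as above, with the seasonal part written out explicitly.
fit0 <- arima0(x, order = c(0, 1, 2),
               seasonal = list(order = c(0, 1, 0), period = 12))
pred <- predict(fit0, n.ahead = 12, se.fit = FALSE)

# Size of the step from the last observation to the first forecast.
first_step <- as.numeric(pred[1] - x[length(x)])

# Standard deviation of the observed month-to-month changes.
typical <- sd(diff(x))

c(first_step = first_step, sd_monthly_change = typical)
```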
    

    [Plot: observed series with the arima0 forecast]

    There is still a jump, but its size is reasonable compared with the variability of the series.

    Nothing is wrong. When we forecast x[49] from x[1:48], the result will generally differ from x[48]. A (0,1,2)(0,1,0)[12] model typically produces a forecast with a linear trend plus a seasonal effect. It helps to visualize your time series and the forecast season by season:

    ts.plot(window(x, 2012, 2012 + 11/12),
            window(x, 2013, 2013 + 11/12),
            window(x, 2014, 2014 + 11/12),
            window(x, 2015, 2015 + 11/12),
            pred, col = 1:5)
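
The same seasonal pattern can be checked numerically. As a small sketch (the names `vals`, `month`, and `dec_to_jan` are illustrative), the snippet below computes the December-to-January change for each pair of consecutive years in the observed data. Two of the three changes are large and positive, which suggests that an upward step at the start of the forecast year is in line with what the series has done before.

```r
# Rebuild the series from the question (monthly, Jan 2012 to Dec 2015).
x <- ts(c(1.92, 2.1, 1.73, 1.35, 1.29, 1.35, 1.42, 1.46, 1.6,
          1.67, 1.98, 1.78, 1.77, 2.35, 1.93, 1.43, 1.29, 1.26, 1.93, 2.33,
          2.22, 2.19, 2.15, 2.25, 3.12, 3.32, 2.72, 2.28, 2.28, 2.16, 2.81,
          3.12, 2.85, 2.98, 3.3, 3.06, 3.56, 3.81, 3.48, 2.64, 2.91, 3.35,
          3.73, 3.58, 4, 3.94, 3.79, 3.85),
        start = c(2012, 1), frequency = 12)

vals  <- as.numeric(x)
month <- as.integer(cycle(x))   # 1 = January, ..., 12 = December

dec <- vals[month == 12]        # Dec 2012, 2013, 2014, 2015
jan <- vals[month == 1]         # Jan 2012, 2013, 2014, 2015

# Change from each December to the following January
# (2012->2013, 2013->2014, 2014->2015).
dec_to_jan <- jan[-1] - dec[-length(dec)]
round(dec_to_jan, 2)
# -0.01  0.87  0.50
```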
    

    [Plot: the series and the forecast, season by season]
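
For readers who prefer to see this on paper, here is a short derivation of the first forecast step under the (0,1,2)(0,1,0)[12] model. It assumes R's sign convention for the MA coefficients (plus signs), writes B for the backshift operator, and uses hats for estimated residuals; these symbols are introduced here and do not appear in the answer above.

```latex
\begin{align*}
% Model definition
(1 - B)(1 - B^{12})\, x_t &= (1 + \theta_1 B + \theta_2 B^2)\, \varepsilon_t \\
% Expanding the differencing operators
x_t &= x_{t-1} + x_{t-12} - x_{t-13}
       + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} \\
% One-step forecast from 48 observations (future shock set to zero)
\hat{x}_{49} - x_{48} &= (x_{37} - x_{36})
       + \theta_1 \hat{\varepsilon}_{48} + \theta_2 \hat{\varepsilon}_{47}
\end{align*}
```

In words, the step from the last observation to the first forecast is last year's December-to-January change plus a correction from the two most recent residuals, so a jump is expected whenever that seasonal change was large.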