I am using auto.arima() from the forecast package and am getting some strange results with the prediction.
library(forecast)
x <- structure(c(1.92, 2.1, 1.73, 1.35, 1.29, 1.35, 1.42, 1.46, 1.6,
1.67, 1.98, 1.78, 1.77, 2.35, 1.93, 1.43, 1.29, 1.26, 1.93, 2.33,
2.22, 2.19, 2.15, 2.25, 3.12, 3.32, 2.72, 2.28, 2.28, 2.16, 2.81,
3.12, 2.85, 2.98, 3.3, 3.06, 3.56, 3.81, 3.48, 2.64, 2.91, 3.35,
3.73, 3.58, 4, 3.94, 3.79, 3.85), .Tsp = c(2012, 2015.91666666667,
12), class = "ts")
fit <- auto.arima(x)
plot(forecast(fit, 12)) #forecast and actual data
f2 <- fitted.values(fit)
lines(f2, col="red") #add predicted values during training
I don't understand how the fitted values (red line) can be so close to the observed values (black) when there is such a big jump in the first forecast.
Any ideas why we see this jump? I've seen other posts on Stack Exchange where the xreg option was used, but I am not using it here, so I haven't been able to track down a similar post.
Generally, I tend to believe that auto.arima slightly overfits the data. Some quick exploratory analysis with the ACF suggests that (0,1,2)(0,1,0)[12] is already a decent model. I will use arima0 from base R (the stats package) to fit this model:
fit0 <- arima0(x, order = c(0,1,2), seasonal = c(0,1,0))
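The exploratory ACF check mentioned above is not shown in the answer. A minimal sketch of what it might look like, assuming the usual approach of differencing the series once seasonally (lag 12) and once at lag 1 to match d = 1 and D = 1, then inspecting the ACF and PACF of the result; the name dx is introduced here for illustration only. If an MA(2) is a reasonable choice, one would expect the ACF to cut off after roughly lag 2, but with only 35 differenced observations this is a rough guide at best.

```r
# Same monthly series as in the question (Jan 2012 to Dec 2015)
x <- ts(c(1.92, 2.1, 1.73, 1.35, 1.29, 1.35, 1.42, 1.46, 1.6, 1.67, 1.98, 1.78,
          1.77, 2.35, 1.93, 1.43, 1.29, 1.26, 1.93, 2.33, 2.22, 2.19, 2.15, 2.25,
          3.12, 3.32, 2.72, 2.28, 2.28, 2.16, 2.81, 3.12, 2.85, 2.98, 3.3, 3.06,
          3.56, 3.81, 3.48, 2.64, 2.91, 3.35, 3.73, 3.58, 4, 3.94, 3.79, 3.85),
        start = c(2012, 1), frequency = 12)

# Seasonal difference (lag 12) followed by a first difference (lag 1)
dx <- diff(diff(x, lag = 12), lag = 1)

# Correlograms of the differenced series, up to two seasonal periods
acf(dx, lag.max = 24)
pacf(dx, lag.max = 24)
```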
Prediction / forecasting is done with predict.arima0:
pred <- predict(fit0, n.ahead = 12, se.fit = FALSE)
Let's plot the observed series and the forecast together:
ts.plot(x, pred, col = 1:2)
There is still a jump, but its size is fairly reasonable compared with the variability of the series.
Nothing is wrong. When we forecast x[49] from x[1:48], it will be different from x[48]. Typically, a (0,1,2)(0,1,0)[12] model has a linear trend plus a seasonal effect. It helps to visualize your time series and prediction season by season:
ts.plot(window(x, 2012, 2012 + 11/12),
window(x, 2013, 2013 + 11/12),
window(x, 2014, 2014 + 11/12),
window(x, 2015, 2015 + 11/12),
pred, col = 1:5)
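To see where the jump comes from, it may help to write out the one-step forecast by hand. Under R's sign convention for MA terms, the (0,1,2)(0,1,0)[12] model gives x[49] = x[48] + (x[37] - x[36]) + theta1 * e[48] + theta2 * e[47] + e[49], so the forecast starts from x[48] and adds last year's December-to-January change before any MA correction. The sketch below computes that first part by plain arithmetic, then adds an approximate MA correction from the last two residuals. The names seasonal_naive and approx_forecast are illustrative, and the hand-built value is only expected to be close to, not identical with, what predict returns.

```r
# Same monthly series as in the question (Jan 2012 to Dec 2015)
x <- ts(c(1.92, 2.1, 1.73, 1.35, 1.29, 1.35, 1.42, 1.46, 1.6, 1.67, 1.98, 1.78,
          1.77, 2.35, 1.93, 1.43, 1.29, 1.26, 1.93, 2.33, 2.22, 2.19, 2.15, 2.25,
          3.12, 3.32, 2.72, 2.28, 2.28, 2.16, 2.81, 3.12, 2.85, 2.98, 3.3, 3.06,
          3.56, 3.81, 3.48, 2.64, 2.91, 3.35, 3.73, 3.58, 4, 3.94, 3.79, 3.85),
        start = c(2012, 1), frequency = 12)

# Part 1: last value plus last year's Dec -> Jan change (no model needed)
seasonal_naive <- x[48] + (x[37] - x[36])

# Part 2: approximate MA(2) correction from the last two residuals
fit0  <- arima0(x, order = c(0, 1, 2), seasonal = c(0, 1, 0))
theta <- unname(coef(fit0))                     # theta[1] = ma1, theta[2] = ma2
e     <- tail(as.numeric(residuals(fit0)), 2)   # e[1] = e_47, e[2] = e_48
approx_forecast <- seasonal_naive + theta[1] * e[2] + theta[2] * e[1]

# Model-based one-step forecast for comparison
pred1 <- as.numeric(predict(fit0, n.ahead = 1, se.fit = FALSE))[1]

print(c(last_obs = x[48], seasonal_naive = seasonal_naive,
        approx_forecast = approx_forecast, predict = pred1))
```

In this series December 2014 to January 2015 rose by 0.50, so the seasonal part alone already places the January 2016 forecast well above the December 2015 value.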