r plot time-series forecasting

'Undo' the transformation of a time series (applied to make it stationary and remove the seasonal trend) to plot the forecast


I am trying to reproduce the paper "Box-Jenkins Seasonal Forecasting: Problems in a Case-Study" (1973) in R. I'm almost done, but I've run into a problem with the forecast.

To make the data stationary and remove the seasonality, the log10 of the sales data (df) was differenced twice: once seasonally (lag 12) and once more at lag 1.

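# required library
library(forecast)

# read the data and build a monthly time series starting January 1965
df <- read.table("Sample.txt", header = TRUE)
sales <- ts(df, frequency = 12, start = c(1965, 1))  # renamed to avoid masking ts()
sales

# log10 transform, then seasonal (lag 12) and first (lag 1) differences
sales_log10 <- log10(sales)
dd12zt <- diff(diff(sales_log10, lag = 12))
dd12zt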

(Sorry, I know there are more elegant ways to write this.) After finding the best-fitting ARIMA model, I forecast the sales for the next 12 months.

# Best fitting ARIMA
ARIMAfit <- auto.arima(dd12zt)
summary(ARIMAfit)

# Forecast
fcast <- forecast(ARIMAfit, h=12)
plot(fcast)
summary(fcast)

Of course, the forecast values are on the differenced log scale, so they do not match those of the sample. My question is: how do I best transform the forecast back to the original scale so that I can plot it along with the original time series?

Data (Sample.txt): Sales 154 96 73 49 36 59 95 169 210 278 298 245 200 118 90 79 78 91 167 169 289 347 375 203 223 104 107 85 75 99 135 211 335 460 488 326 346 261 224 141 148 145 223 272 445 560 612 467 518 404 300 210 196 186 247 343 464 680 711 610 613 392 273 322 189 257 324 404 677 858 895 664 628 308 324 248 272

P.S. I apologize if I have posted anything incorrectly; I'm still not very familiar with Stack Overflow conventions. I look forward to your tips and suggestions.


Solution

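  • The auto.arima function will handle the differencing and transformations for you: lambda=0 requests a log transformation, d=1 and D=1 request one first difference and one seasonal difference, and forecast() returns the forecasts on the original scale. The following code should do what you want.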

    library(forecast)
    y <- ts(c(154,96,73,49,36,59,95,169,210,278,298,245,200,118,90,79,78,91,167,169,289,347,375,203,223,104,107,85,75,99,135,211,335,460,488,326,346,261,224,141,148,145,223,272,445,560,612,467,518,404,300,210,196,186,247,343,464,680,711,610,613,392,273,322,189,257,324,404,677,858,895,664,628,308,324,248,272),
             frequency=12, start=c(1965,1))
    fit <- auto.arima(y, lambda=0, d=1, D=1)
    fcast <- forecast(fit)
    plot(fcast)
    

    However, I think a log transformation is too strong; I'd use a Box-Cox transformation with something like lambda=0.4. Also, you don't really need the first difference; the seasonal difference is sufficient. If you just let auto.arima choose the differencing for you, the results are pretty good:

    fit <- auto.arima(y, lambda=0.4)
    fcast <- forecast(fit)
    plot(fcast)
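
For readers unsure what the lambda argument does, here is a minimal base-R sketch of the Box-Cox transformation for positive data. The helper names box_cox and inv_box_cox are made up for this illustration; the forecast package provides its own BoxCox and InvBoxCox functions, which should be preferred in practice. Note that lambda=0 corresponds to the natural log rather than log10; the two differ only by a constant factor.

```r
# Illustrative helpers (hypothetical names, positive data only).
# lambda == 0 gives the natural log; otherwise (y^lambda - 1) / lambda.
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Inverse transformation: maps transformed values back to the original scale.
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

# Smallest value, first value and largest value of the sales series
y <- c(36, 154, 895)

w_log <- box_cox(y, 0)     # log transformation
w_04  <- box_cox(y, 0.4)   # milder transformation suggested above

# Round trip: transforming and inverting recovers the original values
y_from_log <- inv_box_cox(w_log, 0)
y_from_04  <- inv_box_cox(w_04, 0.4)
```

A lambda between 0 and 1 compresses large values less aggressively than the log does, which is the sense in which the log is "stronger".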
    

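
If you do want to undo the transformation by hand, as the question asks, the differences can be inverted with diffinv from base R. The following is only a sketch for point forecasts: undo_transform is a hypothetical helper written for this illustration, and it assumes the model was fitted to diff(diff(log10(y), 12)) exactly as in the question. Because no model is fitted here, the sketch checks itself by holding out the last 12 months and feeding their true differenced values through the helper.

```r
# Sales data from the question (Sample.txt)
sales <- c(154,96,73,49,36,59,95,169,210,278,298,245,200,118,90,79,78,91,
           167,169,289,347,375,203,223,104,107,85,75,99,135,211,335,460,
           488,326,346,261,224,141,148,145,223,272,445,560,612,467,518,
           404,300,210,196,186,247,343,464,680,711,610,613,392,273,322,
           189,257,324,404,677,858,895,664,628,308,324,248,272)
y <- ts(sales, frequency = 12, start = c(1965, 1))

# Hypothetical helper: map forecasts of diff(diff(log10(y), 12)) back to the
# scale of y, given the observed series the forecasts continue from.
undo_transform <- function(w_fc, y_obs) {
  z   <- log10(y_obs)
  d12 <- diff(z, lag = 12)
  # undo the lag-1 difference, starting from the last seasonal difference
  d12_fc <- diffinv(as.numeric(w_fc), xi = d12[length(d12)])[-1]
  # undo the lag-12 difference, starting from the last 12 log10 values
  z_fc <- diffinv(d12_fc, lag = 12, xi = tail(as.numeric(z), 12))[-(1:12)]
  # undo log10 and attach a time index that continues y_obs
  ts(10^z_fc, frequency = frequency(y_obs),
     start = tsp(y_obs)[2] + 1 / frequency(y_obs))
}

# Self-check: hold out the last 12 months and pretend their true
# differenced values are the "forecast".
y_train  <- ts(sales[1:65], frequency = 12, start = c(1965, 1))
w_all    <- diff(diff(log10(y), lag = 12))
w_future <- tail(as.numeric(w_all), 12)
y_back   <- undo_transform(w_future, y_train)   # should equal sales[66:77]

# With a fitted model one would pass the point forecasts instead, e.g.
#   y_fc <- undo_transform(fcast$mean, y)   # y = the untransformed series
#   ts.plot(y, y_fc, lty = 1:2)
```

This covers point forecasts only. Prediction intervals cannot be recovered by back-transforming the interval endpoints in the same way, because the uncertainty accumulates when the differences are undone; letting auto.arima do the differencing, as above, avoids that problem.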