ARIMA with regressors for hierarchical data forecasting


I am getting this error when trying to use ARIMA with regressors (xreg) on a gts object:

Error in ...fourier(x, K, length(x) + (1:h)) : K must be not be greater than period/2

Here is a minimal reproducible example. What should I set K to? I've tried different values, but none of them work.

library(hts)
y3 <- ts(matrix(rnorm(300),ncol=60,nrow=5))
blnames3 <- paste0(rep(c("CA", "NY"), each = 30), # State
               rep(c("AL", "LA", "CL", "ES"), each = 15), # County
               rep(c("O", "O", "O", "C", "C"), 12), # Industry
               rep(c("p", "q", "r", "p", "q"), 12),  # Sub-industry
               rep(504:507, 15)) # Product
colnames(y3) <- blnames3

gy3 <- gts(y3, characters=list(c(2,2),c(1,1,3)))

i=5
fc <- forecast(gy3, fmethod="arima", seasonal=FALSE, h=6, xreg=fourier(gy3, K=i), newxreg=fourierf(gy3, K=i, h=6))

Solution

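  • There are several problems here. First, each of your time series contains only 5 observations, which is too few to fit any forecasting model. Second, you don't specify the frequency of the data in the ts() call, so it defaults to 1. fourier() requires K to be no greater than period/2, so with a period of 1 no value of K from 1 upwards can pass that check, which is why every value you tried failed. Third, you cannot pass a gts object as the first argument to fourier().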

    Here is some code that works:

    library(hts)
    y3 <- ts(matrix(rnorm(3000),ncol=60), frequency=12)
    blnames3 <- paste0(rep(c("CA", "NY"), each = 30), # State
                       rep(c("AL", "LA", "CL", "ES"), each = 15), # County
                       rep(c("O", "O", "O", "C", "C"), 12), # Industry
                       rep(c("p", "q", "r", "p", "q"), 12),  # Sub-industry
                       rep(504:507, 15)) # Product
    colnames(y3) <- blnames3
    
    gy3 <- gts(y3, characters=list(c(2,2),c(1,1,3)))
    
    i <- 5
    fc <- forecast(gy3, fmethod="arima", seasonal=FALSE, h=6,
            xreg=fourier(y3[,1], K=i), newxreg=fourierf(y3[,1], K=i, h=6))
    

    Note that the first argument to fourier() and fourierf() is used only to determine the frequency and length of the Fourier predictors. Every column of y3 has the same frequency and length, so passing just the first column is sufficient.
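
To see why no value of K worked in the question, it can help to check the frequency and length of each data set before building the Fourier terms. The following is a minimal sketch in base R only. The helper max_fourier_K() is a hypothetical name introduced here for illustration, not part of hts or forecast, and it simply applies the K <= period/2 condition quoted in the error message.

```r
# Base R only; no packages are needed for this check.
set.seed(1)  # arbitrary seed, only so the random data is reproducible

# Data as in the question: no frequency given, so ts() uses frequency = 1
y_bad <- ts(matrix(rnorm(300), ncol = 60, nrow = 5))

# Data as in the answer: monthly frequency, 3000 / 60 = 50 rows per series
y_ok <- ts(matrix(rnorm(3000), ncol = 60), frequency = 12)

# Hypothetical helper (an assumption, not a library function):
# the largest integer K satisfying K <= period / 2
max_fourier_K <- function(x) floor(frequency(x) / 2)

max_fourier_K(y_bad)  # 0: no K >= 1 is allowed when the period is 1
max_fourier_K(y_ok)   # 6: so K = 5 is acceptable

nrow(y_bad)  # 5 observations per series
nrow(y_ok)   # 50 observations per series
```

If you are on a recent version of the forecast package, its documentation marks fourierf() as deprecated in favour of the h argument of fourier(), so newxreg=fourier(y3[,1], K=i, h=6) should be the equivalent call there.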