How to forecast a time series regression model with distributed lags (using dLagM)?


I am trying to forecast a time series with a distributed lag model (using dLagM). The model seems to fit properly and shows all the expected results, but I am unable to forecast any values. The error, at least to me, is opaque.

I suspect it has something to do with my dummy variables and their lags, but I cannot figure it out myself, so after being stuck for a couple of days I am calling for help!

Here is a reproducible example. It uses the dummies and the lags proposed by previous work.

# data
df <- dplyr::tribble(
     ~y ,    ~x,   ~dummy1, ~dummy2,
   207.414  , 59.717     ,  0    ,  0    , 
   177.416  , 59.576     ,  0    ,  0    , 
   245.526  , 63.288     ,  0    ,  0    , 
   276.641  , 61.801     ,  0    ,  0    , 
   371.803  , 58.529     ,  0    ,  0    , 
   519.777  , 56.790     ,  1    ,  0    , 
   430.641  , 54.012     ,  0    ,  1    , 
   251.612  , 57.151     ,  0    ,  0    , 
   269.787  , 57.480     ,  0    ,  0    , 
   230.034  , 60.042     ,  0    ,  0    , 
   202.376  , 60.280     ,  0    ,  0    , 
   253.497  , 61.323     ,  0    ,  0    , 
   239.166  , 61.235     ,  0    ,  0    , 
   272.894  , 60.206     ,  0    ,  0    , 
   293.951  , 62.020     ,  0    ,  0    , 
   278.437  , 61.393     ,  0    ,  0    , 
   424.190  , 58.876     ,  0    ,  0    , 
   652.256  , 56.978     ,  1    ,  0    , 
   536.587  , 56.381     ,  0    ,  1    , 
   263.116  , 61.193     ,  0    ,  0    , 
   289.288  , 60.123     ,  0    ,  0    , 
   227.690  , 60.957     ,  0    ,  0    , 
   234.306  , 62.563     ,  0    ,  0    , 
   293.728  , 61.540     ,  0    ,  0     )

# new auxiliary data to be used as input to forecast y for 12 periods
newdata <- dplyr::tribble(
  ~x,   ~dummy1, ~dummy2,
  61.903     ,  0    ,  0    , 
  60.594     ,  0    ,  0    , 
  63.358     ,  0    ,  0    , 
  65.178     ,  0    ,  0    , 
  64.275     ,  0    ,  0    , 
  59.872     ,  1    ,  0    , 
  59.273     ,  0    ,  1    , 
  59.665     ,  0    ,  0    , 
  58.643     ,  0    ,  0    , 
  63.354     ,  0    ,  0    , 
  65.743     ,  0    ,  0    , 
  65.158     ,  0    ,  0    )



# Model ARDL(1,4)
model = dLagM::ardlDlm(formula = y ~ x + dummy1 + dummy2 ,
                     data = df, 
                     p = 1 , # lag; given by previous analysis
                     q = 4, # order of autoregressive process; given by previous analysis
                     remove = list(p = list(dummy1 = c(1:1), 
                                            dummy2 = c(1:1)))
                    )

# transposed (for dLagM::forecast)
transposed_newdata <- t(newdata)

# forecasting
fLeves <- dLagM::forecast(model,
                   x = transposed_newdata, 
                   h = nrow(newdata),
                   interval = TRUE, 
                   level = 0.95 , 
                   nSim = 100)

# Error
# Error in if (n == 0) return(v) : missing value where TRUE/FALSE needed

Any help is much appreciated!


Solution

  • Thanks for posting this. There was a bug in the code related to variable names, which I fixed in dLagM version 1.1.8. Please note that with version 1.1.8 you still need to avoid nested variable names. For example, if you have a variable named "x1", avoid using another name that contains "x1", such as "x11". I'll fix this issue in the next version.
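
To check a data set for nested variable names before fitting, something like the following base R sketch may help. The helper `find_nested_names` is hypothetical and not part of dLagM, and the column names in the example call are made up for illustration. It only reports pairs in which one name appears as a substring of another.

```r
# Hypothetical helper (not part of dLagM): list every pair of names in which
# one name ("inner") occurs as a substring of another name ("outer").
find_nested_names <- function(var_names) {
  # All ordered pairs of names, excluding a name paired with itself
  pairs <- expand.grid(inner = var_names, outer = var_names,
                       stringsAsFactors = FALSE)
  pairs <- pairs[pairs$inner != pairs$outer, , drop = FALSE]

  # fixed = TRUE does a literal substring match instead of a regex match
  is_nested <- vapply(
    seq_len(nrow(pairs)),
    function(i) grepl(pairs$inner[i], pairs$outer[i], fixed = TRUE),
    logical(1)
  )
  pairs[is_nested, , drop = FALSE]
}

# Made-up column names: "x1" is contained in "x11"
nested <- find_nested_names(c("sales", "x1", "x11", "price"))
print(nested)
```

The same call can be run on `names(df)` for a real data frame. Whether a reported pair actually triggers the problem depends on the package internals, so the result is best treated as a list of names to review and possibly rename.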