
How to forecast ARIMA dynamic regression models for grouped data?

I'm trying to forecast several time series at once with a regression with ARIMA errors (dynamic regression), using grouped data.

I'm new to tidy data, so basically I'm reproducing this example, but with a multivariate ts and a multivariate model.

Here is a reproducible example:

library(tidyverse); library(tidyquant)
library(timetk); library(sweep)
library(fable)  # provides ARIMA()

# using package data

# grouping data 
monthly_qty_by_cat2 <- bike_sales %>%
  mutate(order.month = as_date(as.yearmon( %>%
  group_by(category.secondary, order.month) %>%
  summarise(total.qty = sum(quantity), price.m = mean(price))

# using nest
monthly_qty_by_cat2_nest <- monthly_qty_by_cat2 %>%
  group_by(category.secondary) %>%
  nest()

# Forecasting Workflow
# Step 1: Coerce to a ts object class
monthly_qty_by_cat2_ts <- monthly_qty_by_cat2_nest %>%
  mutate(data.ts = map(.x       = data, 
                       .f       = tk_ts, 
                       select   = -order.month,  # take off date 
                       start    = 2011, 
                       freq     = 12))

# Step 2: modeling an ARIMA(y ~ x)
# make a function to map
modeloARIMA_reg <- function(y, x) {
  result <- ARIMA(y ~ x)
}

# map the function 
monthly_qty_by_cat2_fit <- monthly_qty_by_cat2_ts %>%
  mutate(fit.arima = map(data.ts, modeloARIMA_reg))

Here I don't know whether map is passing the right variable as y (the dependent variable), but I went ahead and tried the forecast anyway, and an error appears:

# Step 3: Forecasting the model
monthly_qty_by_cat2_fcast <- monthly_qty_by_cat2_fit %>%
  mutate(fcast.arima = map(fit.arima, forecast))

# this gives me the following error
# Error: Problem with `mutate()` input `fcast.arima`.
# x non-numeric argument to binary operator
# i Input `fcast.arima` is `map(fit.arima, forecast)`.
# i The error occured in group 1: category.secondary = "Cross Country Race".
# Run `rlang::last_error()` to see where the error occurred.
# In addition: Warning message:
#   In mean.default(x, na.rm = TRUE) :
#   argument is not numeric or logical: returning NA

Two questions emerge:

I don't know how to supply the mean of the independent variable (x) for each group;

and I don't know how to pass this new data as an argument to forecast.

PS: The result doesn't need to be a tibble or nested; I just need the point forecast and the CI (total.qty lo.95 hi.95).


  • Well, this code solves the problem for me. It makes one forecast for each time series (a grouped tsibble) and uses each series' own mean value as the future data in the forecast. Any comment is welcome.

    # MY FLOW
    library(tidyverse); library(tsibble); library(fable)
    monthly_qty_by_cat2 <- 
      sweep::bike_sales %>%
      mutate(order.month = yearmonth( %>%
      group_by(category.secondary, order.month) %>%
      summarise(total.qty = sum(quantity), price.m = mean(price)) %>% 
      as_tsibble(index = order.month, key = category.secondary) # coerce to a tsibble
    # mean for the future
    futuro <- monthly_qty_by_cat2 %>% 
      group_by(category.secondary) %>% 
      mutate(fut_x = mean(price.m)) %>% 
      do(price.m = head(.$fut_x,1))
    # as.numeric
    futuro$price.m <- as.numeric(futuro$price.m)
    # make values in the future
    future_x <- new_data(monthly_qty_by_cat2, 3) %>%
      left_join(futuro, by = "category.secondary")
    # model and forecast
    fc <- monthly_qty_by_cat2 %>% 
      group_by(category.secondary) %>% 
      model(ARIMA(total.qty ~ price.m))  %>%
      forecast(new_data=future_x)  %>% 
      hilo(level = 95)
    # Tidy the forecast
    fc_tibble <- fc %>% as_tibble() %>% select(-total.qty)
    # the end
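
Since the question asks for plain columns (total.qty lo.95 hi.95), here is a minimal sketch of pulling the interval bounds out of the hilo column. It runs on a toy tsibble rather than bike_sales: the object `toy`, the key `category`, the index `month` and all values are made up for illustration. It also assumes a recent fable/distributional, where `$lower` and `$upper` work on a hilo column, and it swaps the `do()` step for a plain `summarise()` to get the group means.

```r
library(dplyr)
library(tsibble)
library(fable)

set.seed(1)

# Toy stand-in for the grouped monthly data (hypothetical values):
# two categories, 36 months each, quantity loosely driven by price
toy <- tibble(
  category = rep(c("A", "B"), each = 36),
  month    = yearmonth("2011 Jan") + rep(0:35, times = 2),
  price.m  = runif(72, 900, 1100)
) %>%
  mutate(total.qty = 500 - 0.3 * price.m + rnorm(72, sd = 10)) %>%
  as_tsibble(index = month, key = category)

# One mean price per category, used as the future regressor value
future_price <- toy %>%
  as_tibble() %>%
  group_by(category) %>%
  summarise(price.m = mean(price.m))

# Three future months per category, with the mean price attached
future_x <- new_data(toy, n = 3) %>%
  left_join(future_price, by = "category")

# One regression with ARIMA errors per category, then forecast
fc <- toy %>%
  model(arima = ARIMA(total.qty ~ price.m)) %>%
  forecast(new_data = future_x)

# Point forecast plus 95% bounds as ordinary numeric columns
fc_tbl <- fc %>%
  hilo(level = 95) %>%
  as_tibble() %>%
  transmute(category, month,
            total.qty = .mean,
            lo.95 = `95%`$lower,
            hi.95 = `95%`$upper)
```

`fc_tbl` is then an ordinary tibble with one row per category and forecast month, which can be written out or joined like any other data frame.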

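
On the doubt raised in the question about which variable `map()` passes as y: `map(data.ts, f)` calls `f` with a single argument, so a two-argument function receives the whole multi-column ts as `y` and nothing as `x`. The sketch below shows this with a small made-up ts and a hypothetical `probe()` function that has the same signature as `modeloARIMA_reg` but only reports what it received. Separately, as far as I know, fable's `ARIMA()` only builds a model specification and is meant to be used inside `model()`, which is why the tsibble flow above avoids the problem.

```r
library(purrr)

# Hypothetical stand-in for one nested `data.ts` element:
# a two-column monthly ts (values are made up)
m <- ts(cbind(total.qty = c(10, 12, 9, 14),
              price.m   = c(100, 98, 103, 97)),
        start = 2011, frequency = 12)

# Same signature as modeloARIMA_reg(y, x), but it only inspects its inputs
probe <- function(y, x) {
  x_missing <- missing(x)
  list(y_columns = colnames(y), x_missing = x_missing)
}

# map() supplies each list element as the first argument only
res <- map(list(m), probe)
res[[1]]$y_columns  # column names of what arrived as y
res[[1]]$x_missing  # whether x was supplied at all

# One way to separate the columns explicitly inside a mapped function
split_cols <- function(m) {
  list(y = m[, "total.qty"], x = m[, "price.m"])
}
parts <- split_cols(m)
```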