r time-series forecasting fable

How to know the best FASSTER formula


My data structure is shown below and has hourly intervals. I need to forecast the Demand.

# A tsibble: 23,400 x 6 [1h] <UTC>
          Date           Demand WeekDay DaysAfterHoliday Influenza MAX_Temperature
        <dttm>            <int>   <int>            <int>     <dbl>           <dbl>
 1 2017-05-01 00:00:00    122       1                0      1               19.2
 2 2017-05-02 01:00:00    124       2                1      3.04            25.3

...

I know that on the day after a holiday the number of patients in the ED is higher than usual, but I can't be sure that the model takes that into account. The data has daily, weekly and annual seasonality (especially for fixed holidays).

To handle multiple seasonality and holiday effects I can use FASSTER. I read the R documentation page on this and some presentations, but in those cases the seasonality and the forecast formula are given to the function explicitly, like this:

# NOT RUN {
cbind(mdeaths, fdeaths) %>%
  as_tsibble %>%
  model(FASSTER(mdeaths ~ fdeaths + poly(1) + trig(12)))

# }

Is there a way to make FASSTER search for the most adequate formula? If not, how can I know which approach is best?

Thank you in advance!


Solution

  • The fasster package currently doesn't provide any facilities for automatic model selection (https://github.com/tidyverts/fasster/issues/50).

    To identify an appropriate fasster model specification, you can start by graphically exploring your data to identify its structure. Some questions you may consider include:

    • Is your data seasonal? Which seasonal periods are required?
      Include seasonality with Fourier terms via fourier(period, K) or with seasonal factors via season(period). Using fourier() terms is generally better, as being able to specify the number of harmonics (K) allows you to control the smoothness of the seasonality and reduce the number of model parameters.
    • Does your data include a level or local trends?
      Include a level with poly(1) or a trend with poly(2).
    • Are there potential exogenous regressors (a good example of this is temperature in electricity demand)?
      Include exogenous regressors in the same way as you would in lm().
    • Do the patterns in your data alternate in predictable ways (for example, seasonality on weekdays and weekends)?
      Use %S% to switch between these patterns. For example, to have a different seasonal pattern for weekdays and weekends, you may consider day_type %S% (fourier("day", K = 7)), where day_type is a variable in your model that specifies if the day is a weekday or weekend.
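
As a rough illustration of how the pieces above might combine, here is a minimal sketch fitted to synthetic hourly data. The data values, the DayType variable, the assumption that WeekDay 6 and 7 are the weekend, and the choice K = 7 are all made up for the example, and the term names (fourier(), poly(), %S%) are the ones used in this answer, so they may differ in other versions of the package. Treat it as a starting point to adapt, not as a recommended specification.

```r
library(dplyr)
library(tsibble)
library(fabletools)
library(fasster)  # assumed to be installed from github.com/tidyverts/fasster

# Synthetic stand-in for the question's hourly data (all values are made up).
set.seed(1)
n_obs <- 24 * 7 * 8  # eight weeks of hourly observations
ed <- data.frame(
  Date = as.POSIXct("2017-05-01 00:00:00", tz = "UTC") + 3600 * (seq_len(n_obs) - 1)
) %>%
  mutate(
    hour = as.integer(format(Date, "%H")),
    WeekDay = as.integer(format(Date, "%u")),  # ISO weekday: 1 = Monday, 7 = Sunday
    # Assumption: days 6 and 7 form the weekend.
    DayType = factor(if_else(WeekDay >= 6, "Weekend", "Weekday")),
    MAX_Temperature = 20 + 5 * sin(2 * pi * hour / 24) + rnorm(n_obs),
    Demand = rpois(
      n_obs,
      lambda = 100 + 20 * sin(2 * pi * hour / 24) + 10 * (DayType == "Weekend")
    )
  ) %>%
  as_tsibble(index = Date)

# Daily Fourier seasonality that switches by day type, plus a level and
# temperature as an exogenous regressor.
fit <- ed %>%
  model(
    fasster = FASSTER(
      Demand ~ DayType %S% (fourier("day", K = 7)) + poly(1) + MAX_Temperature
    )
  )

fit
```

Because there is no automatic selection, alternative specifications would have to be written out by hand in the same way and compared.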

    A simple approach to capturing the increase in patients after a holiday would be to include DaysAfterHoliday as an exogenous regressor. As this relationship is likely non-linear, you may need to also include some non-linear transformations of this variable as exogenous regressors.
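
To make "non-linear transformations" concrete, the sketch below uses plain lm() on toy data to compare a few ways of encoding DaysAfterHoliday. The toy data, the size of the post-holiday bump, and the four candidate encodings are assumptions chosen for illustration. Since exogenous regressors are written as in lm(), the same right-hand-side terms could then be tried in the FASSTER formula, though that carry-over is not tested here.

```r
# Toy data: DaysAfterHoliday counts days since the last holiday (0 = the holiday).
set.seed(42)
toy <- data.frame(DaysAfterHoliday = rep(0:6, times = 20))
# Made-up demand with a bump on the first day after a holiday.
toy$Demand <- 120 + 15 * (toy$DaysAfterHoliday == 1) + rnorm(nrow(toy), sd = 3)

# Candidate encodings, written exactly as they would be for lm().
candidates <- list(
  linear    = Demand ~ DaysAfterHoliday,
  quadratic = Demand ~ DaysAfterHoliday + I(DaysAfterHoliday^2),
  indicator = Demand ~ I(as.numeric(DaysAfterHoliday == 1)),
  decay     = Demand ~ exp(-DaysAfterHoliday)
)

# Fit each candidate and compare by AIC (lower is better).
fits <- lapply(candidates, lm, data = toy)
aics <- sapply(fits, AIC)
aics
```

On real data the shape of the effect is unknown, so plotting Demand against DaysAfterHoliday first is a reasonable way to decide which encodings are worth trying.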