
R - How to forecast by group for a daily time series with multiple variables


I am new to time series forecasting by group.

I have a large daily time series dataset that I need to forecast.

I did a lot of googling and tried a lot of different ways with no success.

date       country  device   os       browser  visits  clicks  logins  sale
7/29/2018  USA      desktop  Windows  Firefox    3046    1523     762   381
7/29/2018  USA      mobile   Windows  Firefox    6546    3273    1637   818
7/29/2018  USA      tablet   Windows  Firefox     864     432     216   108
7/30/2018  USA      desktop  Windows  Firefox   11004    5502    2751  1376
7/30/2018  USA      mobile   Windows  Firefox    7938    3969    1985   992
7/30/2018  USA      tablet   Windows  Firefox    1114     557     279   139
7/31/2018  USA      desktop  Windows  Firefox   10814    5407    2704  1352
7/31/2018  USA      mobile   Windows  Firefox    7560    3780    1890   945
7/31/2018  USA      tablet   Windows  Firefox     984     492     246   123

This is an example dataset I generated as I could not find any other open dataset that could properly represent my problem. (apologies if the sample numbers are bad)

I wish to forecast daily 'visits', 'clicks', 'logins' and 'sale' for the next 'n' days on this dataset, grouped by 'country', 'device', 'os' and 'browser'.

Any help would be highly appreciated.


Solution

  • This is exactly the use case for which we are developing the tsibble and fable packages. tsibble is on CRAN (https://cran.r-project.org/package=tsibble), while fable is still only on GitHub (https://github.com/tidyverts/fable).

    You could do something like this to forecast clicks by country, device, os and browser:

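    library(dplyr)
    library(tsibble)
    library(fable)
    # Parse the m/d/Y strings into Dates, then declare the grouping keys and the time index
    mydata <- dataframe %>%
      mutate(date = as.Date(date, format = "%m/%d/%Y")) %>%
      as_tsibble(key = c(country, device, os, browser), index = date)
    # Fit one ETS model per key combination and forecast the next 14 days
    mydata %>%
      model(ETS(clicks)) %>%
      forecast(h = 14)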
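
The question also asks for visits, logins and sale, not only clicks. One possible way to cover all four measures in a single pass is to reshape the data to long form so that the measure name becomes an extra key. The sketch below is an illustration, not part of the original answer: the toy data, the `n_days` horizon, and the `measure`/`value` column names are made up, and the use of `tidyr::pivot_longer()` is an assumption about how you might want to structure this.

```r
library(dplyr)
library(tidyr)
library(tsibble)
library(fable)

# Hypothetical toy data shaped like the question's table (values are random)
set.seed(1)
dataframe <- expand_grid(
  date    = seq(as.Date("2018-07-29"), by = "day", length.out = 60),
  country = "USA",
  device  = c("desktop", "mobile"),
  os      = "Windows",
  browser = "Firefox"
) %>%
  mutate(
    visits = round(runif(n(), 800, 11000)),
    clicks = round(visits / 2),
    logins = round(clicks / 2),
    sale   = round(logins / 2)
  )

n_days <- 7  # assumed forecast horizon

fc <- dataframe %>%
  # Stack the four measures into one column, with the measure name as a key
  pivot_longer(c(visits, clicks, logins, sale),
               names_to = "measure", values_to = "value") %>%
  as_tsibble(key = c(country, device, os, browser, measure), index = date) %>%
  # One ETS model per country/device/os/browser/measure combination
  model(ets = ETS(value)) %>%
  forecast(h = n_days)
```

With this layout, `fc` holds `n_days` rows per series, and `filter(fc, measure == "clicks")` would pull out the forecasts for a single measure. Note that each series is modelled independently here, so any relationship between visits, clicks, logins and sale is ignored.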