Future dataset is incomplete when using Fable Prophet


I'm trying to view the out-of-sample performance scores after fitting a fable.prophet model. Note that the forecasts are grouped by type and look 5 observations ahead.

Here is the code:

library(tibble)
library(tsibble)
library(fable.prophet)

lax_passengers <- read.csv("https://raw.githubusercontent.com/mitchelloharawild/fable.prophet/master/data-raw/lax_passengers.csv")


library(dplyr)
library(lubridate)
lax_passengers <- lax_passengers %>%
  mutate(datetime = mdy_hms(ReportPeriod)) %>%
  group_by(month = yearmonth(datetime), type = Domestic_International) %>%
  summarise(passengers = sum(Passenger_Count)) %>%
  ungroup()

lax_passengers <- as_tsibble(lax_passengers, index = month, key = type)
fit <- lax_passengers %>% 
  model(
    mdl = prophet(passengers ~ growth("linear") + season("year", type = "multiplicative")),
  )
fit

test_tr <- lax_passengers %>%
  slice(1:(n()-5)) %>%
  stretch_tsibble(.init = 12, .step = 1)


fc <- test_tr %>%
  model(
    mdl = prophet(passengers ~ growth("linear") + season("year", type = "multiplicative")),
  ) %>%
  forecast(h = 5)


fc %>% accuracy(lax_passengers)

When I run fc %>% accuracy(lax_passengers), I get the following warning:

Warning message:
The future dataset is incomplete, incomplete out-of-sample data will be treated as missing. 
5 observations are missing between 2019 Apr and 2019 Aug 

How do I make the future dataset complete? I believe the performance scores aren't accurate because of the 5 missing observations.

It seems that when I stretch the tsibble, the slice isn't applied correctly: it doesn't remove the last 5 observations from each type.


Solution

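  • On an ungrouped tsibble, slice() indexes rows across the entire dataset, so slice(1:(n()-5)) only removes the last 5 rows of your last key (type == "International"). The other key keeps all of its rows, which is why its final forecasts have no matching observations. To remove the last 5 rows from every key, group by key before slicing.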

    test_tr <- lax_passengers %>%
      group_by_key() %>% 
      slice(1:(n()-5)) %>%
      ungroup() %>% 
      stretch_tsibble(.init = 12, .step = 1)
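
To see the difference without fitting any models, here is a minimal sketch on a toy tsibble. The data, the key names "A" and "B", and the object names toy, ungrouped and grouped are made up for illustration and are not part of the original question.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: two keys ("A" and "B") with 8 monthly observations each
toy <- tibble(
  month = rep(yearmonth("2019 Jan") + 0:7, times = 2),
  type  = rep(c("A", "B"), each = 8),
  value = 1:16
) %>%
  as_tsibble(index = month, key = type)

# Ungrouped: n() is the row count of the whole table (16),
# so the 5 dropped rows all come from the last key, "B"
ungrouped <- toy %>%
  slice(1:(n() - 5))

# Grouped by key: n() is the row count within each key (8),
# so the last 5 rows are dropped from "A" and from "B"
grouped <- toy %>%
  group_by_key() %>%
  slice(1:(n() - 5)) %>%
  ungroup()

# Compare how many rows each key keeps in the two cases
count(as_tibble(ungrouped), type)
count(as_tibble(grouped), type)
```

In the ungrouped result one key is left untouched while the other loses 5 rows; in the grouped result both keys are trimmed equally, which is what the cross-validation folds need so that every 5-step forecast has observations to be scored against.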