I am trying to produce Holt forecasts for multiple time series and combine them with my original data.frame. Consider the following data.frame, which contains two population groups:
library("forecast")
d <- data.frame(SEX = c("MALE","MALE","MALE","FEMALE","FEMALE","FEMALE"),
                EDUCATION = c("01","01","01","01","01","01"),
                TIME = c("2000","2001","2002","2000","2001","2002"),
                VALUE = c(120,150,140,90,75,60))
Then I run Holt's forecast on the two time series:
male <- ts(as.numeric(d[1:3,]$VALUE),start=c(2000))
female <- ts(as.numeric(d[4:6,]$VALUE),start=c(2000))
forecastmale <- holt(male,h = 3,damped = FALSE)
forecastfemale <- holt(female,h = 3,damped = FALSE)
Then I save the result and combine with my original data.frame:
forecastmale <- data.frame(forecastmale[["mean"]])
forecastfemale <- data.frame(forecastfemale[["mean"]])
forecastmale$SEX <- c("MALE","MALE","MALE")
forecastmale$EDUCATION <- c("01","01","01")
forecastmale$TIME <- c("2003","2004","2005")
colnames(forecastmale)[1] <- "VALUE"
forecastmale <- forecastmale[, c(2,3,4,1)]
forecastfemale$SEX <- c("FEMALE","FEMALE","FEMALE")
forecastfemale$EDUCATION <- c("01","01","01")
forecastfemale$TIME <- c("2003","2004","2005")
colnames(forecastfemale)[1] <- "VALUE"
forecastfemale <- forecastfemale[, c(2,3,4,1)]
d <- rbind(d,forecastmale,forecastfemale)
This works when I only have two time series. But if I have, say, 100 time series to forecast, it is not a very efficient way to do it. Can anyone help make the code more efficient, so that if I, for instance, include an extra population group in my data.frame, I do not have to change anything in the code?
This is what the fable package is designed to handle. Here is an example using the same data structure that you have.
library(dplyr)
library(tsibble)
library(fable)
# Artificial data
df <- expand.grid(
  education = 1:3,
  sex = c("male","female"),
  year = 1990:2002
) %>%
  as_tsibble(index=year, key=c(sex,education)) %>%
  mutate(value = rnorm(78))
# Fit Holt's method to each series and forecast 3 years ahead
df %>%
  model(holt = ETS(value ~ trend("A"))) %>%
  forecast(h=3)
#> # A fable: 18 x 6 [1Y]
#> # Key: sex, education, .model [6]
#> sex education .model year value .mean
#> <fct> <int> <chr> <dbl> <dist> <dbl>
#> 1 male 1 holt 2003 N(0.14, 1.7) 0.137
#> 2 male 1 holt 2004 N(0.17, 1.7) 0.171
#> 3 male 1 holt 2005 N(0.21, 1.7) 0.205
#> 4 male 2 holt 2003 N(-0.75, 1.5) -0.749
#> 5 male 2 holt 2004 N(-0.84, 1.8) -0.837
#> 6 male 2 holt 2005 N(-0.93, 2) -0.926
#> 7 male 3 holt 2003 N(0.51, 0.7) 0.514
#> 8 male 3 holt 2004 N(0.53, 0.7) 0.530
#> 9 male 3 holt 2005 N(0.55, 0.7) 0.546
#> 10 female 1 holt 2003 N(0.44, 0.98) 0.445
#> 11 female 1 holt 2004 N(0.47, 0.98) 0.470
#> 12 female 1 holt 2005 N(0.5, 0.98) 0.495
#> 13 female 2 holt 2003 N(0.13, 0.89) 0.127
#> 14 female 2 holt 2004 N(0.15, 0.89) 0.148
#> 15 female 2 holt 2005 N(0.17, 0.89) 0.168
#> 16 female 3 holt 2003 N(0.78, 1.8) 0.781
#> 17 female 3 holt 2004 N(0.88, 1.8) 0.880
#> 18 female 3 holt 2005 N(0.98, 1.8) 0.978
Created on 2020-09-05 by the reprex package (v0.3.0)
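The question also asks for the forecasts to be combined with the original data. A minimal sketch of one way to do that, assuming the same artificial data as above: the names `fc`, `fc_df` and `combined` are only illustrative, `set.seed(1)` is added just to make the example reproducible, and the model is written out in full as additive error, additive trend, no seasonality, which is my reading of Holt's method rather than something the example above specifies.

```r
library(dplyr)
library(tsibble)
library(fable)

set.seed(1)  # only so the artificial data is reproducible

# Same artificial data as above: 6 series (2 sexes x 3 education levels), 13 years each
df <- expand.grid(
  education = 1:3,
  sex = c("male", "female"),
  year = 1990:2002
) %>%
  as_tsibble(index = year, key = c(sex, education)) %>%
  mutate(value = rnorm(78))

# Fit one model per key combination and forecast 3 years ahead
fc <- df %>%
  model(holt = ETS(value ~ error("A") + trend("A") + season("N"))) %>%
  forecast(h = 3)

# Keep only the point forecasts, using the same column names as the original data
fc_df <- fc %>%
  as_tibble() %>%
  transmute(sex, education, year, value = .mean)

# Stack the history and the forecasts into one ordinary data frame
combined <- bind_rows(as_tibble(df), fc_df) %>%
  arrange(sex, education, year)
```

Because the model is fitted once per key, adding another population group to the data should not require any change to this code. With a data.frame like the one in the question, the TIME column would probably need converting from character to a number (for example with `as.integer()`) before calling `as_tsibble()`.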