r dplyr forecast

Dplyr Pipe Forecast next 5 items by Group


I am trying to forecast by each group within a data frame (in this case LSOA), for the next 5 years. I have a data set of three columns: LSOA, Date and Value. Similar to this:

LSOA Date Value
E01026449 31/03/2021 401
E01026449 31/03/2022 415
E01026449 31/03/2023 441
E01026450 31/03/2021 413
E01026450 31/03/2022 428
E01026450 31/03/2023 440
E01026451 31/03/2021 607
E01026451 31/03/2022 625
E01026451 31/03/2023 633

I have tried several nested-list solutions, but none of them work: they only fit the existing yearly values, and I am not sure where to put predict and h= to get the next x results.

My completely broken code below:

datamodel<-split(data[, -1], data$LSOA)

ld <- lapply(datamodel, function(x) {ts(c(t(x[,-2])),start = c(2010,3,31), frequency = 1)})
lest<-lapply(ld, function(x){holt(x)})

lts<- lapply(lest, function(x){predict(x, newdata=1)})

lts <- lapply(ld, holt, model = "nZZ")

I know I need to:

1.) Group by LSOA
2.) Develop a model for each group
3.) Apply model to prediction for the group

So ideally I would be able to predict and append the 31/03/2024 number for each LSOA or set h to some number of future predictions. But I am missing something silly here.

How can I achieve this all in a dplyr pipe?


Solution

  • You can use the map() function from purrr on the levels of LSOA. The first map() creates a list with one sub data frame per LSOA. Then, using map() again, you can fit any model to each of those data frames, which returns a list of your models.

    library(tidyverse)

    data = tribble(~LSOA, ~Date, ~Value,
                   "E01026449", "31/03/2021", 401,
                   "E01026449", "31/03/2022", 415,
                   "E01026449", "31/03/2023", 441,
                   "E01026450", "31/03/2021", 413,
                   "E01026450", "31/03/2022", 428,
                   "E01026450", "31/03/2023", 440,
                   "E01026451", "31/03/2021", 607,
                   "E01026451", "31/03/2022", 625,
                   "E01026451", "31/03/2023", 633) %>%
      # The raw dd/mm/yyyy strings will not work in a model, so parse them first
      mutate(Date = as.Date(Date, format = "%d/%m/%Y"),
             Year = as.integer(format(Date, "%Y")))

    levels(as.factor(data$LSOA)) %>%
      # One data frame per LSOA
      map(~ data %>% filter(LSOA == .x)) %>%
      # Insert the prediction here, for example a linear trend per LSOA
      map(~ lm(Value ~ Year, data = .x))
               
    

    I didn't get what you wanted to predict, sorry.
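
If the goal is the next 5 yearly values for each LSOA in a single pipe, one possible sketch is below. It is an illustration under assumptions, not the asker's Holt model: it swaps in a plain linear trend fitted with lm(), derives a numeric Year column from Date, and uses a horizon variable h that is introduced here only for the example. The three steps from the question (group, fit, predict) map onto group_by(), lm() and predict() inside group_modify().

```r
library(tidyverse)

data <- tribble(
  ~LSOA,       ~Date,        ~Value,
  "E01026449", "31/03/2021", 401,
  "E01026449", "31/03/2022", 415,
  "E01026449", "31/03/2023", 441,
  "E01026450", "31/03/2021", 413,
  "E01026450", "31/03/2022", 428,
  "E01026450", "31/03/2023", 440,
  "E01026451", "31/03/2021", 607,
  "E01026451", "31/03/2022", 625,
  "E01026451", "31/03/2023", 633
)

h <- 5  # forecast horizon in years (assumed from "next 5 years")

forecasts <- data %>%
  # Parse dd/mm/yyyy and keep the year as a numeric predictor
  mutate(Year = as.integer(format(as.Date(Date, format = "%d/%m/%Y"), "%Y"))) %>%
  # 1) Group by LSOA
  group_by(LSOA) %>%
  group_modify(~ {
    # 2) Fit one model per group (linear trend, an assumption for this sketch)
    fit <- lm(Value ~ Year, data = .x)
    # 3) Predict the h years after the last observed year
    future <- tibble(Year = max(.x$Year) + seq_len(h))
    mutate(future, Forecast = as.numeric(predict(fit, newdata = future)))
  }) %>%
  ungroup()

forecasts
```

The result has one row per LSOA and future year, so it can be appended to the original data with bind_rows() after renaming Forecast to Value. If you prefer the forecast package, holt() takes the horizon in the model call itself (for example holt(x, h = 5)) and stores the point forecasts in the mean element of the returned object; whether it behaves sensibly on only three observations per group is something to check on your own data.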