r forecasting arima

One Step Ahead Forecasting in R


I am using this small set of data points to forecast the following intervals via one-step-ahead forecasting. I have built a custom function to do this, but whenever I print the results, the value for the next interval (2022) is missing. I would appreciate help with forecasting the next year.

My data:

structure(list(Year = c(2012, 2013, 2014, 2015, 2016, 2017, 2018, 
2019, 2020, 2021), Adm.Numbers = c(1660, 1726, 1846, 1955, 2026, 
1999, 1954, 1924, 1952, 2078)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -10L))

Code:

summary(Ad1)
plot(Ad1, type = "b", col = "black")

Arima_prediction_1 <- function(k) {
  
  range_validation <- 3
  n_ahead <- 1
  
  train_tbl <- Ad1 %>% slice((1 + k):(2 + k))
  valid_tbl <- Ad1 %>% slice((2 + 1 + k):(2 + k + range_validation)) 
  test_tbl  <- Ad1 %>% slice((2 + k + range_validation + 1):(2 + k + range_validation + n_ahead))
  
  train_arima <- bind_rows(train_tbl, valid_tbl) %>% select(1:2)
  test_arima <- test_tbl %>% select(1:2)
  
  # ARIMA model: 
  my_arima <- auto.arima(train_arima[, 2] %>% ts(start = 1))
  
  # Use the model for forecasting: 
  predicted_arima <- forecast(my_arima, h = 1)$mean %>% as.vector()
  
  actual_predicted_df_test <- test_arima %>% 
    mutate(predicted = predicted_arima) 
  
  return(actual_predicted_df_test)
  
}

options(scipen = 9999)
arima_results <- lapply(0:5, Arima_prediction_1)
arima_results <- do.call("bind_rows", arima_results)
view(arima_results)

Solution

  • You do not see 2022 because that year is not in your data frame (Ad1), which is where the test row is extracted from.

    So you need to tweak your code a little, mainly by generating the corresponding sequence of years. Instead of dplyr::slice I used direct index selection on the data frame, and I also made some minor changes to your code:

    library(forecast)
    library(dplyr)
    
    Arima_prediction_1 <- function(k) {
    
        range_validation <- 3
        n_ahead <- 1
    
        train_tbl <- Ad1[(1 + k):(2 + k), ]
        valid_tbl <- Ad1[(2 + 1 + k):(2 + k + range_validation), ] 
        # Take the max year of the validation data, build the sequence up to
        # max year + n_ahead, and drop the first element (the max year itself)
        test_tbl  <- data.frame(Year = seq(from = max(valid_tbl$Year), 
                                           to = max(valid_tbl$Year) + n_ahead)[-1],
                                Adm.Numbers = Ad1[(2 + k + range_validation + 1):(2 + k + range_validation + n_ahead), 2])
    
        train_arima <- rbind(train_tbl, valid_tbl) 
        test_arima <- test_tbl
    
        # ARIMA model: 
        my_arima <- forecast::auto.arima(ts(train_arima[, 2], start = 1))
    
        # Use the model for forecasting: 
        predicted_arima <- forecast::forecast(my_arima, h = 1)$mean %>% as.vector()
    
        actual_predicted_df_test <- test_arima %>% 
            dplyr::mutate(predicted = predicted_arima) 
    
        return(actual_predicted_df_test)
    
    }
    
    arima_results <- lapply(0:5, Arima_prediction_1) 
    do.call("bind_rows", arima_results)
    
      Year Adm.Numbers predicted
    1 2017        1999  2026.000
    2 2018        1954  1981.793
    3 2019        1924  1956.000
    4 2020        1952  1971.600
    5 2021        2078  1971.000
    6 2022          NA  1981.400
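
    Row 6 is the forecast that was missing. The following is a minimal sketch of why the original slice() version lost it, assuming dplyr's documented behaviour that slice() silently ignores positive indices beyond the last row. The value 1981.4 is only a placeholder standing in for a forecast.

```r
library(dplyr)

Ad1 <- tibble(
  Year = c(2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021),
  Adm.Numbers = c(1660, 1726, 1846, 1955, 2026, 1999, 1954, 1924, 1952, 2078)
)

# For k = 5 the original test window points at row 11, which does not exist.
k <- 5
range_validation <- 3
n_ahead <- 1
test_row <- 2 + k + range_validation + n_ahead   # 11

# slice() drops the out-of-range index, leaving a zero-row table.
test_tbl <- Ad1 %>% slice(test_row)
nrow(test_tbl)

# mutate() on a zero-row table stays zero-row, so the forecast disappears.
lost <- test_tbl %>% mutate(predicted = 1981.4)  # placeholder value
nrow(lost)

# Generating the year from the last known year keeps a row for it.
next_years <- seq(from = max(Ad1$Year), to = max(Ad1$Year) + n_ahead)[-1]
next_years
```

    This is why the answer builds the Year column from a sequence instead of reading it from Ad1.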
    

    I do not quite understand why valid_tbl is separated from train_tbl only to be bound back together before the actual fitting. Possibly you reduced the code complexity for the reprex.
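
    If the split is not needed elsewhere, the same rolling windows could be taken in one step. This is only a sketch under that assumption: one_step_ahead and its width argument are hypothetical names, and width = 5 corresponds to the 2 training rows plus 3 validation rows used above. The actual value is looked up by year, so a year that has not been observed yet comes back as NA.

```r
library(forecast)

Ad1 <- data.frame(
  Year = c(2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021),
  Adm.Numbers = c(1660, 1726, 1846, 1955, 2026, 1999, 1954, 1924, 1952, 2078)
)

# Hypothetical helper: fit on `width` consecutive rows starting at row k + 1,
# then forecast the year that follows the window.
one_step_ahead <- function(k, width = 5) {
  window <- Ad1[(1 + k):(width + k), ]
  fit <- auto.arima(ts(window$Adm.Numbers, start = min(window$Year)))

  target_year <- max(window$Year) + 1
  # match() returns NA for a year not in the data, which yields an NA actual.
  actual <- Ad1$Adm.Numbers[match(target_year, Ad1$Year)]

  data.frame(
    Year = target_year,
    Adm.Numbers = actual,
    predicted = as.numeric(forecast(fit, h = 1)$mean)
  )
}

arima_results <- do.call(rbind, lapply(0:5, one_step_ahead))
arima_results
```

    With k running from 0 to 5 the target years are 2017 to 2022, matching the rows in the table above.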