r time-series forecasting

Bad performance of TBATS and NNETAR functions when forecasting on a long-term basis in R


I have been using the tbats and nnetar functions from the forecast package to produce hourly electric load forecasts with horizons of a week and a month, and both models perform satisfactorily. My data set comprises hourly values from January 2017 up to early May 2022 (46848 values). However, when I try to forecast the hourly load up to the end of the year (07/05/2022-31/12/2022, 5736 hourly values), the forecasts are either flat or lose their seasonality. Does anyone have any idea why the long-term forecasts are so poor? Any ideas on either model would be highly appreciated. I apologise for the very large data set.

I have uploaded the data set to GitHub:

df <- read.csv(file = "https://raw.githubusercontent.com/Argiro1983/Load/LOAD/LOAD_2017_2022.csv", sep=";")
#fix datetime
df$TIME<- with(df, sprintf("%02d:00", TIME-1))
df$DATE<-as.Date(df$DATE, "%d/%m/%Y")
df$TIME <- paste(df$TIME, ':00', sep = '')
View(df)

library(ggpubr)
library(chron)
df$TIME <- chron(times=df$TIME)

DATETIME<-as.POSIXct(paste(df$DATE, df$TIME), origin = "1970-01-01 00:00:00", tz="UTC", usetz=TRUE)
my_df <- data.frame(timestamp = as.POSIXct(DATETIME, format = "%d.%m.%Y %H:%M", origin = "1970-01-01 00:00:00", tz = "UTC"), input = df[,3])
my_df <- setNames(my_df, c("DATETIME","LOAD"))

In particular, the TBATS forecasts lose their seasonality and look strange. The code I used is the following:

library(ggplot2)
library(forecast)
library(tseries)
library(dplyr)

Load = ts(my_df[, c('LOAD')])
my_df$Clean_Load = tsclean(Load)
Clean_Load = ts(my_df[, c('Clean_Load')])
load_ts = ts(Clean_Load)

msts <- msts(load_ts, seasonal.periods=c(24,168,8760), start=c(2017,01))
plot(msts, main="Load", xlab="Year", ylab="MWh")

s <- tbats(msts)
sp<- predict(s,h=5736) 

The results are also flat when I run the nnetar function, with or without temperature as an external regressor. I have tried different lambdas, but none seems to work:

#create dataframe for temperature historical values
Temperature_history <- read.csv(file = "https://raw.githubusercontent.com/Argiro1983/Load/LOAD/Temperature_history.csv", sep=";")

DATETIME<-as.POSIXct(Temperature_history$Datetime, format = "%d/%m/%Y %H:%M", tz="UCT", usetz=TRUE)
Temperature_df <- data.frame(timestamp = as.POSIXct(DATETIME, format = "%d/%m/%Y %H:%M", tz = "UCT"), input = Temperature_history$Temperature)
Temperature_df<- setNames(Temperature_df, c("DATETIME","TEMPERATURE"))


#create dataframe for temperature forecasted values
Temperature_forecast <- read.csv(file = "https://raw.githubusercontent.com/Argiro1983/Load/LOAD/Temperature_forecast.csv", sep=";")

DATETIME2<-as.POSIXct(Temperature_forecast$datehour, format = "%d/%m/%Y %H:%M", tz="UCT", usetz=TRUE)
Temp_forecast <- data.frame(timestamp = as.POSIXct(DATETIME2, format = "%d/%m/%Y %H:%M", tz = "UCT"), input = Temperature_forecast$TEMP_FORECAST)
View(Temp_forecast)
Temp_forecast <- setNames(Temp_forecast, c("DATETIME","TEMPERATURE"))
View(Temp_forecast)

#define and run NN model
library(forecast)
myts = ts(my_df$LOAD, frequency = 24)

fit2 = nnetar(myts,xreg = Temperature_df$TEMPERATURE, lambda = 0.5, P=1, MaxNWts=1177)
nnetforecast <- forecast(fit2, xreg = Temp_forecast$TEMPERATURE, h = 5736, PI = F, npaths=100, bootstrap = TRUE)  
autoplot(nnetforecast, h = 5736)



Solution

  • First, your code won't work because the GitHub link does not point to the CSV file. Replace the first line as follows:

    df <- read.csv(file = "https://raw.githubusercontent.com/Argiro1983/Load/LOAD/LOAD_2017_2022.csv", sep=";")
    

    Then, running your code, I get reasonable results from the tbats model for the first two weeks:

    sp <- forecast(s,h=14*24) 
    autoplot(sp, include=14*24)
    


    Using a time series model to forecast much further ahead makes little sense here.

    In any case, there are well-developed models for electricity demand that will do better than either TBATS or NNETAR. For a simple starting point, try Tao Hong's vanilla model, described in Section 2.2 of https://doi.org/10.1016/j.ijforecast.2015.09.006. It's just a linear regression, but it will do better than either of the models you are trying.
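
As a rough illustration of that kind of regression, here is a minimal sketch in R. It assumes the vanilla model has the form load ~ trend + month + weekday x hour + a cubic temperature polynomial interacted with month and with hour; that form should be checked against Section 2.2 of the paper before relying on it. The data are synthetic, and every column name (LOAD, TEMP, trend, month, wday, hour) is made up for the example. With the real data, the same columns would be built from my_df and the temperature files.

```r
# Sketch of a "vanilla"-style load regression fitted with lm().
# ASSUMPTION: the model form below is an interpretation of the benchmark, and
# the data are synthetic stand-ins for the real load and temperature series.
set.seed(1)

# Two years of hourly timestamps (UTC avoids daylight-saving gaps)
stamp <- seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
             as.POSIXct("2021-12-31 23:00", tz = "UTC"), by = "hour")
n  <- length(stamp)
lt <- as.POSIXlt(stamp)

# Calendar predictors; factors carry all levels so prediction never meets a new one
dat <- data.frame(
  trend = seq_len(n),
  month = factor(lt$mon + 1, levels = 1:12),
  wday  = factor(lt$wday, levels = 0:6),
  hour  = factor(lt$hour, levels = 0:23)
)

# Synthetic temperature: annual cycle + daily cycle + noise
dat$TEMP <- 15 + 10 * sin(2 * pi * (lt$yday - 110) / 365) +
  4 * sin(2 * pi * (lt$hour - 9) / 24) + rnorm(n, sd = 2)

# Synthetic load: U-shaped in temperature, with daily and weekend effects
dat$LOAD <- 5000 + 3 * (dat$TEMP - 18)^2 +
  400 * sin(2 * pi * (lt$hour - 7) / 24) -
  300 * (lt$wday %in% c(0, 6)) + 0.01 * dat$trend + rnorm(n, sd = 50)

# Hold out 7 May to 31 December of the second year (5736 hours)
cutoff <- as.POSIXct("2021-05-07 00:00", tz = "UTC")
train  <- dat[stamp <  cutoff, ]
test   <- dat[stamp >= cutoff, ]

# Trend, month, weekday-by-hour, and cubic temperature crossed with month and hour
fit <- lm(LOAD ~ trend + month + wday * hour +
            (TEMP + I(TEMP^2) + I(TEMP^3)) * (month + hour),
          data = train)

# Forecast the holdout using its temperatures, then score with MAPE (%)
pred <- predict(fit, newdata = test)
mape <- mean(abs(test$LOAD - pred) / test$LOAD) * 100
cat(sprintf("Holdout MAPE: %.2f%%\n", mape))
```

Because this model is driven by calendar variables and temperature rather than by recent load values, its forecasts keep their daily and weekly shape over the whole horizon, provided temperature values are available for the forecast period. Here the held-out temperatures are used directly; in practice they would be forecasts or scenarios, such as the Temperature_forecast file in the question.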