I'm wondering how I can adapt Rob Hyndman's use of Fourier terms, as in this blog post, to forecast weekly time series data with an additional regressor. Below is my attempt, but I get an error saying that xreg is rank deficient:
library(forecast)
gascsv <- read.csv("https://robjhyndman.com/data/gasoline.csv", header=FALSE)[,1]
gas <- ts(gascsv[1:300], freq=365.25/7, start=1991+31/365.25)
# assume that gasreg is an additional regressor used to forecast gas
gasreg <- ts(gascsv[301:600], freq=365.25/7, start=1991+31/365.25)
bestfit <- list(aicc=Inf)
for(i in 1:25){
  for(j in 1:25){
    fit <- auto.arima(gas, xreg=cbind(fourier(gas, K=i), fourier(gasreg, K=j)), seasonal=FALSE)
    if(fit$aicc < bestfit$aicc){
      bestfit <- fit
      k <- i
      l <- j
    }
    else break;
  }
}
Thanks!
Edit: After some additional digging around online, I've found some materials that seem helpful. Another of Rob's blog posts uses a set of Fourier terms as well as a dummy variable as regressors. This post on Kaggle (see 3. ARIMA model) uses multiple Fourier terms in a way very similar to what I'm doing, although I still receive the "xreg is rank deficient" error. Could this be caused by gasreg being the same data as gas?
fourier(gas, K=i) and fourier(gasreg, K=j) produce the same set of Fourier terms: I believe the results of fourier() depend only on the length and seasonal period of the time series, not on its content. The rank deficiency error was caused by including the same regressors twice. I don't need to supply the Fourier terms twice, and the code below seems to suffice.
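bestfit <- list(aicc=Inf)
# Only K needs to be searched; gasreg enters once as a plain regressor,
# so the inner loop over j is no longer needed.
for(i in 1:25){
  fit <- auto.arima(gas, xreg=cbind(fourier(gas, K=i), gasreg), seasonal=FALSE)
  if(fit$aicc < bestfit$aicc){
    bestfit <- fit
    k <- i
  }
  else break  # stop at the first K that does not improve the AICc
}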
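For anyone who wants to check the diagnosis above, here is a minimal sketch. It uses two simulated series (y and z are hypothetical stand-ins, not the gasoline data) with the same length and weekly frequency, and shows that fourier() returns the same terms for both, so binding them together adds no new information to xreg:

```r
library(forecast)

# Hypothetical stand-ins for gas and gasreg: same length and frequency, different values
set.seed(1)
y <- ts(rnorm(300), frequency = 365.25/7)
z <- ts(rnorm(300), frequency = 365.25/7)

f_y <- fourier(y, K = 2)
f_z <- fourier(z, K = 2)

# The Fourier terms agree even though y and z contain different data
same_terms <- isTRUE(all.equal(f_y, f_z))
print(same_terms)

# Binding both sets doubles the columns but not the rank
x_dup <- cbind(f_y, f_z)
print(ncol(x_dup))
print(qr(x_dup)$rank)
```

If the rank is lower than the number of columns, the regressor matrix has redundant columns, which is what the "xreg is rank deficient" message is complaining about.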