Search code examples
rforecasting

passing sparse xreg to stlf in R causes optimisation error


I am trying to forecast a time series, and regress on temperature. The residuals show a different behaviour at low and high temperatures so I want to use piecewise linear approach, so learn different coeffecients for temperatures above and below 35 degrees. The data is in a dataframe data$x, data$Season, data$Temp.

#Create data frame
len<-365*3 + 1 +31 
x<-rnorm(len,mean=4000000,sd=100000)
Season<-c(rep(3,62),rep(4,91),rep(1,90),rep(2,92),rep(3,92),rep(4,91),rep(1,90),rep(2,92),rep(3,92),rep(4,91),rep(1,91),rep(2,92),rep(3,61))
Temp<-rnorm(len,mean=20,sd=5)
data<-data.frame(x,Season,Temp)
#Create model matrix 
season_dummy<-model.matrix(~as.factor(data$Season)+0)
Temp_max=pmax(0,data$Temp-35) # creates 0, or a difference
Temp_restore<-restore_temp_up(Temp_max,data$Temp,35) # restores difference to original value
Temp_season_matrix_max=Temp_restore * season_dummy

#Create time-series and forecast
data_ts<-ts(data$x[1:1000],freq=365,start=c(2009,182))
len_train<-length(data_ts)
xreg1<-Temp_season_matrix_max[1:len_train,]
newxreg1<-Temp_season_matrix_max[(len_train+1):(len_train+30),]     
 stlf(data_ts,method="arima",h=30,xreg=xreg1,newxreg=newxreg1,s.window="periodic")

 >   Error in optim(init[mask], armaCSS, method = optim.method, hessian = FALSE,  : 
      non-finite value supplied by optim



Error in auto.arima(x, xreg = xreg, seasonal = FALSE, ...) : 
  No suitable ARIMA model found
In addition: Warning message:
In auto.arima(x, xreg = xreg, seasonal = FALSE, ...) :
  Unable to calculate AIC offset
> 

Other threads suggest changing method solver from CSS to ML, but I cant edit these parameters in stlf. The help file shows an optional parameter "forecastfunction" but there are no examples of real explanation how to use it.

Note - when I set the min temperature to say 20, instead of 35, this works ok - I am sure it is because the xreg matrix containing temperatures above 35 degress is sparse (most temperatures are below this value), but I am not sure how to get around this. (I have included code for restore_temp_up - possibly inefficient, but included here for question completion.)

restore_temp_up<-function(x,original,k){
  if(!is.vector(x))
    stop('x must be a vector')
  for (i in 1:length(x)){
    if(!is.na(x[i])){
      if (x[i] > 0){
        x[i]<-x[i]+k
      }
      if (original[i] == k){
        x[i]<-original[i]  ## this is the case if original WAS =k, then dont know whether original is 0, 
      }
    }
  }
  return(x)
}

Solution

  • Your design matrix is rank deficient so the regression is singular. To see this:

    > eigen(t(xreg1) %*% xreg1)$val
    [1] 1321.223    0.000    0.000    0.000
    

    You cannot fit a regression model with a rank deficient design matrix.