rmse function issue in R


I have R code with a nested for loop in which I call the rmse() function from the Metrics package. The call works on its own outside the loop, but inside the nested loop it fails.

Here is what I want to do in R:

  1. I generate a time series of length 50.
  2. I slice that same time series into chunks of sizes 2, 3, ..., 48, 49, which gives 48 different series derived from step 1.
  3. I split each of the 48 series into train and test sets so I can use the rmse function from the Metrics package to get the Root Mean Squared Error (RMSE) for each of the 48 series formed in step 2.
  4. The RMSE for each series is then tabulated by chunk size.
  5. I obtain the best ARIMA model for each of the 48 series.
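
Steps 2 and 3 above can be illustrated on a toy series using base R only. This is a minimal sketch, not the code from the question: the values in `x`, the helper `rmse_manual`, and the naive forecast are assumptions made for illustration, and `rmse_manual` simply implements the textbook RMSE definition rather than calling the Metrics package.

```r
# Toy series (hypothetical values) standing in for the simulated AR(1) data
x <- 1:10

# Step 2: slice the series into consecutive blocks of size l
l <- 3
m <- ceiling(length(x) / l)  # number of blocks; the last one may be shorter
blk <- split(x, rep(1:m, each = l, length.out = length(x)))

# Step 3: 60/40 train/test split, as in the question
train <- head(x, round(length(x) * 0.6))
h <- length(x) - length(train)
test <- tail(x, h)

# Textbook RMSE: square root of the mean squared difference
rmse_manual <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Placeholder forecast (NOT an ARIMA forecast): repeat the last training value
naive <- rep(tail(train, 1), h)
err <- rmse_manual(test, naive)
```

In the real workflow the `naive` vector would be replaced by the numeric forecast mean, and both arguments passed to the RMSE function must have the same length `h`.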

My R code

 # simulate arima(1,0,0)
 library(forecast)
 library(Metrics)
 n <- 50
 phi <- 0.5
 set.seed(1)
 wn <- rnorm(n, mean=0, sd=1)
    ar1 <- sqrt((wn[1])^2/(1-phi^2))
 for(i in 2:n){
   ar1[i] <- ar1[i - 1] * phi + wn[i]
 }
 ts <- ar1

 t<-length(ts)# the length of the time series
 li <- seq(n-2)+1 # vector of block sizes(i.e to be between 1 and n exclusively)

 RMSEblk<-matrix(nrow = 1, ncol = length(li))#vector to store block means
 colnames(RMSEblk)<-li
 for (b in 1:length(li)){
     l<- li[b]# block size
     m <- ceiling(t / l) # number of blocks
     blk<-split(ts, rep(1:m, each=l, length.out = t)) # divides the series into blocks
     singleblock <- vector() #initialize vector to receive result from for loop
     for(i in 1:10){
         res<-sample(blk, replace=T, 100) # resamples the blocks
         res.unlist<-unlist(res, use.names = F) # unlist the bootstrap series
         # Split the series into train and test set
         train <- head(res.unlist, round(length(res.unlist) * 0.6))
         h <- length(res.unlist) - length(train)
         test <- tail(res.unlist, h)

        # Forecast for train set
        model <- auto.arima(train)
        future <- forecast(test, model=model,h=h)
        nfuture <- as.numeric(out$mean) # makes the `future` object a vector
        # use the `rmse` function from `Metrics` package
        RMSE <- rmse(test, nn)
        singleblock[i] <- RMSE # Assign RMSE value to final result vector element i
    }
    #singleblock
    RMSEblk[b]<-mean(singleblock) #store into matrix
 }
 RMSEblk

The error I got

#Error in rmse(test, nn): unused argument (nn)
#Traceback:

But when I wrote

library(forecast)

train <- head(ar1, round(length(ar1) * 0.6))
h <- length(ar1) - length(train)
test <- tail(ar1, h)
model <- auto.arima(train)
#forecast <- predict(model, h)
out <- forecast(test, model=model,h=h)
nn <- as.numeric(out$mean)
rmse(test, nn)

It did work

Could you point out what I am missing?


Solution

  • I am able to run your code after making two very small corrections in your for loop. See the two commented lines:

     for (b in 1:length(li)){
         l<- li[b]
         m <- ceiling(t / l)
         blk<-split(ts, rep(1:m, each=l, length.out = t))
         singleblock <- vector()
         for(i in 1:10){
             res<-sample(blk, replace=T, 100)
             res.unlist<-unlist(res, use.names = F)
             train <- head(res.unlist, round(length(res.unlist) * 0.6))
             h <- length(res.unlist) - length(train)
             test <- tail(res.unlist, h)
    
            model <- auto.arima(train)
            future <- forecast(test, model=model,h=h)
            nfuture <- as.numeric(future$mean) # EDITED: `future` instead of `out`
            RMSE <- rmse(test, nfuture) # EDITED: `nfuture` instead of `nn`
            singleblock[i] <- RMSE
        }
        RMSEblk[b]<-mean(singleblock)
     }
    

    It is possible that these typos did not result in errors earlier because nn and out were still defined in the global environment while you ran the for loop. A good debugging tip is to restart R, so that the workspace is empty, and try to reproduce the problem.
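
The effect of a leftover global object can be reproduced in a few lines of base R. This is a hedged sketch: the function `score` and the numeric values are hypothetical and only mimic the `nn` / `nfuture` mix-up from the question.

```r
# A stale object left over from an earlier experiment (hypothetical values)
nn <- c(1, 2, 3)

score <- function(test) {
  nfuture <- c(1.5, 2.5, 3.5)   # the vector we meant to use
  sqrt(mean((test - nn)^2))     # typo: `nn` instead of `nfuture`
}

# Runs without complaint, because R finds `nn` in the global environment
masked <- score(c(1, 2, 3))

# After removing the stale object (or restarting R), the typo surfaces
rm(nn)
msg <- tryCatch(score(c(1, 2, 3)), error = function(e) conditionMessage(e))
```

Here `masked` is computed from the wrong vector without any warning, while `msg` captures the error that only appears once `nn` no longer exists. That is why a fresh session is a useful check before asking why a loop behaves differently from a standalone snippet.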