Tags: r, parallel-processing, time-series, forecasting

Training Multiple Auto.Arima Models in Parallel


In the code below I'm trying to train two different auto.arima models at the same time, in parallel on different cores. When I run it I get the error shown below. I'm not sure whether the problem is with do.call or with parLapply. I'm also fairly new to parallel processing, so any tips would be very helpful.

Code:
library("forecast")
library("parallel")

TList2<-list(x=tsd1, lambda = Tlambda, stepwise=TRUE, approximation = TRUE)
DList2<-list(x=tsd2, lambda = Rlambda, stepwise=TRUE, approximation = TRUE)

##Parallelizing ARIMA Model Training

# Number of worker processes (one per model)
no_cores <- 2

# Initiate cluster
cl <- makeCluster(no_cores)

ARIMA_List<-list(TList2,DList2)

ARIMA_Models<-parLapply(cl, ARIMA_List,
                    function(x){do.call(auto.arima, args=x)})   

stopCluster(cl)


Error:
Error in checkForRemoteErrors(val) : 
  one node produced an error: object 'auto.arima' not found

Data:

dput(TList2)
structure(list(x = c(6, 15.5, 22, 16, NA, NA, 13, 13.5, 10, 6, 
14.5, 16, NA, 8, 11, NA, 2, 2, 10, NA, 9, NA, 11, 16, NA, 4, 
17, 7, 11.5, 22, 20.5, 10, 22, NA, 13, 17, 22, 9, 13, 19, 8, 
16, 18, 22, 21, 14, 7, 20, 21.5, 17), lambda = 0.999958829041611, 
    stepwise = TRUE, approximation = TRUE), .Names = c("x", "lambda", 
"stepwise", "approximation"))

dput(DList2)
structure(list(x = c(11, 4, 8, 11, 11, NA, 3, 2.5, 6, 11, 7, 
1, NA, 6, 6, NA, 6, 11, 3, NA, 11, NA, 10, 10, NA, NA, 9, 3, 
3, 11, 8, 10, NA, NA, 11, 10, 9, 3, 7, NA, 2, 4, 11, 2.5, 3, 
NA, 4, 7, 1, 5), lambda = 0.170065851742339, stepwise = TRUE, 
    approximation = TRUE), .Names = c("x", "lambda", "stepwise", 
"approximation"))

Solution

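  • The worker processes started by makeCluster are fresh R sessions, so a package loaded in the master session (here forecast) is not loaded on them, which is why auto.arima is not found. Load forecast on every worker before calling parLapply, for example with clusterEvalQ: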

    TList2 <- structure(list(x = c(6, 15.5, 22, 16, NA, NA, 13, 13.5, 10, 6, 
    14.5, 16, NA, 8, 11, NA, 2, 2, 10, NA, 9, NA, 11, 16, NA, 4, 
    17, 7, 11.5, 22, 20.5, 10, 22, NA, 13, 17, 22, 9, 13, 19, 8, 
    16, 18, 22, 21, 14, 7, 20, 21.5, 17), lambda = 0.999958829041611, 
        stepwise = TRUE, approximation = TRUE), .Names = c("x", "lambda", 
    "stepwise", "approximation"))
    
    DList2<- structure(list(x = c(11, 4, 8, 11, 11, NA, 3, 2.5, 6, 11, 7, 
    1, NA, 6, 6, NA, 6, 11, 3, NA, 11, NA, 10, 10, NA, NA, 9, 3, 
    3, 11, 8, 10, NA, NA, 11, 10, 9, 3, 7, NA, 2, 4, 11, 2.5, 3, 
    NA, 4, 7, 1, 5), lambda = 0.170065851742339, stepwise = TRUE, 
        approximation = TRUE), .Names = c("x", "lambda", "stepwise", 
    "approximation"))
    
    library("forecast")
    library("parallel")

    # One worker per model
    no_cores <- 2
    cl <- makeCluster(no_cores)

    # Load forecast on every worker so auto.arima can be found there
    clusterEvalQ(cl, library(forecast))

    ARIMA_List <- list(TList2, DList2)
    ARIMA_Models <- parLapply(cl, ARIMA_List,
                              function(x) do.call(auto.arima, args = x))
    stopCluster(cl)
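
To see why the error appears, here is a minimal sketch that checks which packages are attached on the workers. It uses splines (shipped with R) as a stand-in for forecast so it runs without extra installs. The helper name is_attached and the two-worker cluster are assumptions made for illustration, not part of the original code.

```r
library(parallel)
library(splines)  # attached in the master session only

# Hypothetical helper: TRUE if `pkg` is on the search path of the
# R session that evaluates it
is_attached <- function(pkg) paste0("package:", pkg) %in% search()

cl <- makeCluster(2)

# Each worker is a separate R session, so ask every worker directly
before <- unlist(clusterCall(cl, is_attached, "splines"))

# Attach the package on every worker, the same way
# clusterEvalQ(cl, library(forecast)) does in the answer above
invisible(clusterEvalQ(cl, library(splines)))

after <- unlist(clusterCall(cl, is_attached, "splines"))

stopCluster(cl)

print(before)  # state of the workers before clusterEvalQ
print(after)   # state of the workers after clusterEvalQ
```

Another option that should also work is to call forecast::auto.arima inside the worker function, since the :: operator loads the package namespace on whichever session evaluates it.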