
Using foreach for parallel boosting in R


I routinely use the foreach package to train random forests in parallel in R, and I'm looking for a rough equivalent for training AdaBoost models. The problem is how to combine the results. The randomForest package has the 'combine' function, which merges multiple randomForest objects into a single RF object. Do any boosting packages offer a similar function? I typically use the adabag package, but I can't figure out how to combine the models it outputs, or whether that is even possible. Has anyone tried this and found a solution? This snippet creates the models in parallel:

library(foreach)
library(adabag)
library(doMC)
library(rpart)

registerDoMC(4)

data(iris)

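# Train four 5-iteration boosted models, one per task.
# No .combine is given: c() would flatten each model's components into one long list
# instead of returning a list of models.
testADA <- foreach(mfinal = rep(5, 4), .packages = "adabag") %dopar%
  boosting(Species ~ ., data = iris, boos = TRUE, mfinal = mfinal,
           control = rpart.control(minsplit = 0, cp = 0.000001))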

But then I just end up with a list of models rather than a single model and I can't figure out how to combine them.


Solution

  • You could use the caret package. Examples can be found here: http://caret.r-forge.r-project.org/parallel.html.

    In your case it could look like this:

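    library(caret)
    library(doMC)
    registerDoMC(cores = 4)  # train() runs its resampling loop on the registered backend
    # "AdaBoost.M1" (backed by adabag) is used because iris has three classes;
    # method = "ada" wraps the ada package, which handles two-class responses only.
    model <- train(Species ~ ., data = iris, method = "AdaBoost.M1")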
    

    The "doMC" package doesn't work on Windows. Here is an alternative for Windows machines:

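    library(caret)
    library(doParallel)
    cl <- makeCluster(4)  # number of worker processes
    registerDoParallel(cl)
    # "AdaBoost.M1" (backed by adabag) handles the three-class iris response;
    # method = "ada" supports two-class responses only.
    model <- train(Species ~ ., data = iris, method = "AdaBoost.M1")
    stopCluster(cl)  # release the workers when done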
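
Boosting builds its trees sequentially, so independently trained adabag models cannot be merged the way randomForest's combine merges forests, and as far as I know adabag has no such function. One possible workaround is to keep the list of models and average their predicted class probabilities. The sketch below is only an illustration under stated assumptions: `predict_ensemble` and its `classes` argument are hypothetical names, not part of adabag, and it assumes the columns of `$prob` returned by `predict()` follow the factor level order of the response.

```r
library(foreach)
library(doParallel)
library(adabag)

cl <- makeCluster(2)  # worker count is an arbitrary choice for this sketch
registerDoParallel(cl)

data(iris)

# Without .combine, foreach returns a plain list with one boosting object per task.
models <- foreach(seed = 1:4, .packages = "adabag") %dopar% {
  set.seed(seed)  # give each task its own bootstrap samples
  boosting(Species ~ ., data = iris, boos = TRUE, mfinal = 5)
}
stopCluster(cl)

# Hypothetical helper (not part of adabag): average the class probabilities
# of several boosting models and pick the most probable class per row.
# Assumption: the columns of $prob are ordered like `classes`.
predict_ensemble <- function(models, newdata, classes) {
  probs <- lapply(models, function(m) predict(m, newdata = newdata)$prob)
  avg <- Reduce(`+`, probs) / length(probs)
  colnames(avg) <- classes
  list(prob = avg, class = classes[max.col(avg, ties.method = "first")])
}

pred <- predict_ensemble(models, iris, classes = levels(iris$Species))
table(predicted = pred$class, actual = iris$Species)
```

Note that this is an ensemble of four small boosted models, not a single 20-iteration AdaBoost model, so its accuracy should be checked on held-out data rather than assumed to match.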