Tags: r, parallel-processing, parallel-foreach

Run a for loop in parallel in R


I have a for loop that is something like this:

finalMatrix = NULL #start empty so cbind() has something to append to
for (i in 1:150000) {
   tempMatrix = functionThatDoesSomething() #calling a function
   finalMatrix = cbind(finalMatrix, tempMatrix)
}

Could you tell me how to make this parallel?

I tried this based on an example online, but I am not sure the syntax is correct. It also did not increase the speed much.

finalMatrix = foreach(i=1:150000, .combine=cbind) %dopar%  {
   tempMatrix = {}
   tempMatrix = functionThatDoesSomething() #calling a function

   cbind(finalMatrix, tempMatrix)

}

Solution

  • Thanks for your feedback. I did look up parallel after I posted this question.

    After a few tries, I finally got it running. I have added the code below in case it is useful to others.

    library(foreach)
    library(doParallel)
    
    #set up the parallel backend to use several cores
    cores <- detectCores()
    cl <- makeCluster(max(1, cores - 1)) #leave one core free so the machine stays responsive
    registerDoParallel(cl)
    
    finalMatrix <- foreach(i=1:150000, .combine=cbind) %dopar% {
       tempMatrix = functionThatDoesSomething() #calling a function
       #do other things if you want
    
       tempMatrix #Equivalent to finalMatrix = cbind(finalMatrix, tempMatrix)
    }
    #stop cluster
    stopCluster(cl)
    

    Note - If you allocate too many processes, you may get this error: Error in serialize(data, node$con) : error writing to connection
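
One way to avoid asking for too many processes is to cap the worker count before calling makeCluster(). The helper below is a hypothetical sketch, not part of any package, and the cap of 8 is an arbitrary assumption that you would tune to your machine's cores and memory.

```r
library(parallel)

# Hypothetical helper: leave one core free, never go below 1 worker,
# and never exceed the chosen cap.
chooseWorkers <- function(available = detectCores(), cap = 8L) {
  # detectCores() can return NA when the count cannot be determined.
  if (is.na(available)) available <- 1L
  as.integer(max(1L, min(available - 1L, cap)))
}

workers <- chooseWorkers()
```

The result can then be passed on as makeCluster(workers).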

    Note - If .combine in the foreach statement is rbind, the returned object is built by appending the output of each iteration row-wise.
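
To make the difference concrete, here is a small sketch that runs the same loop with cbind and with rbind. It uses %do% (sequential) so no cluster is needed, and makeColumn() is a hypothetical stand-in for functionThatDoesSomething().

```r
library(foreach)

# Hypothetical stand-in for functionThatDoesSomething(): returns a 2 x 1 matrix.
makeColumn <- function(i) matrix(c(i, i^2), nrow = 2)

# %do% runs sequentially, which is enough to show how .combine shapes the result.
byCol <- foreach(i = 1:3, .combine = cbind) %do% makeColumn(i)
byRow <- foreach(i = 1:3, .combine = rbind) %do% makeColumn(i)

dim(byCol)  # 2 rows, 3 columns: pieces placed side by side
dim(byRow)  # 6 rows, 1 column: pieces stacked on top of each other
```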

    Hope this is useful for folks trying out parallel processing in R for the first time like me.

    References:
    http://www.r-bloggers.com/parallel-r-loops-for-windows-and-linux/
    https://beckmw.wordpress.com/2014/01/21/a-brief-foray-into-parallel-processing-with-r/