r  parallel-processing  doparallel  snow

R: doSNOW/foreach create list of list


Hi, I would like to create a named list of lists using the doSNOW/foreach packages. For example, the end product would be a list object, dfe, whose names are taken from a vector such as

n=c("n1","n2","n3","n4","n5")

so that I can access the list of lists like dfe[["n1"]]$a, where a is an element of the inner list.
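To make the target shape concrete, here is a hand-built sketch of the structure described above. It uses base R only, shows just two of the five names for brevity, and the values are placeholders taken from the example further down.

```r
# Hand-built version of the desired result (two entries only, for brevity)
dfe <- list(
    n1 = list(a = c(1, 2, 3), b = c(1, 2, 3)),
    n2 = list(a = c(1, 2, 3), b = c(1, 2, 3))
)

dfe[["n1"]]$a   # [1] 1 2 3
names(dfe)      # [1] "n1" "n2"
```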

Here is an example of what I'm talking about.

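library(doSNOW)    # attaches foreach and provides registerDoSNOW()
library(parallel)  # detectCores(), makeCluster()

mainStart <- Sys.time()

n <- c("n1", "n2", "n3", "n4", "n5")

cores <- detectCores()
cl <- parallel::makeCluster(max(1, cores - 1))  # leave one core free so the computer is not overloaded
registerDoSNOW(cl)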

## setup progress bar 
pb <- txtProgressBar(max = 5, style = 3)
progress <- function(n) setTxtProgressBar(pb, n)
opts <- list(progress = progress)


dfe <-  foreach(id.this = n, .combine = list, .options.snow = opts) %dopar% {
    list ( a=c(1,2,3), b = c(1,2,3))
}

endTime <- Sys.time()
endTime -mainStart 


close(pb)
stopCluster(cl)

So it would be great if the list created in the foreach loop could be named and accessed after the loop, such that dfe[["n1"]]$a gives me the vector 1,2,3.


Solution

  • As suggested above, it is easier to just use setNames(dfe, n). I did not think that would work, since some processes might take longer than others, but it appears that the order does not change: foreach returns results in iteration order by default (.inorder = TRUE). For example, when I set

    if ( id.this == "n2"){
            Sys.sleep(10)
        }
    

    the order was still retained. So the final code would be something like this:

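    library(doSNOW)    # attaches foreach and provides registerDoSNOW()
    library(parallel)  # detectCores(), makeCluster()
    
    mainStart <- Sys.time()
    
    n <- c("n1", "n2", "n3", "n4", "n5")
    
    cores <- detectCores()
    cl <- parallel::makeCluster(max(1, cores - 1))  # leave one core free so the computer is not overloaded
    registerDoSNOW(cl)
    
    ## set up progress bar, one tick per task
    pb <- txtProgressBar(max = length(n), style = 3)
    progress <- function(n) setTxtProgressBar(pb, n)
    opts <- list(progress = progress)
    
    dfe <- foreach(id.this = n, .options.snow = opts) %dopar% {
        #list(id.this = list(  a=c(1,2,3), b = c(1,2,3) ) )
        # delay one task to check that the result order is unaffected
        if (id.this == "n2") {
            Sys.sleep(10)
        }
        # id.this is stored in a only to show which task produced each element;
        # note that this turns a into a character vector
        list(a = c(id.this, 2, 3), b = c(1, 2, 3))
    }
    
    endTime <- Sys.time()
    endTime - mainStart
    
    close(pb)
    stopCluster(cl)
    
    ## results come back in the order of n, so the names line up
    dfe <- setNames(dfe, n)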
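As a possible variation on the code above, the naming step can be folded into the foreach call through its .final argument, which is applied once to the collected result. This is only a sketch under a few assumptions that are not in the original: a fixed cluster of two workers, a shorter sleep, no progress bar, and an extra id element added purely so the ordering can be checked. .inorder = TRUE is already the default and is spelled out only to make the ordering guarantee visible.

```r
library(doSNOW)   # attaches foreach and provides registerDoSNOW()

n <- c("n1", "n2", "n3", "n4", "n5")

cl <- parallel::makeCluster(2)   # two workers: an arbitrary choice for this sketch
registerDoSNOW(cl)

dfe <- foreach(id.this = n,
               .inorder = TRUE,                       # default; results follow the order of n
               .final = function(x) setNames(x, n)    # name the collected list once at the end
               ) %dopar% {
    if (id.this == "n2") Sys.sleep(2)   # make one task deliberately slow
    # id is a hypothetical extra field, used only to verify the ordering
    list(id = id.this, a = c(1, 2, 3), b = c(1, 2, 3))
}

stopCluster(cl)

names(dfe)
dfe[["n1"]]$a

# TRUE if every element sits under the name of the task that produced it
all(vapply(n, function(k) dfe[[k]]$id == k, logical(1)))
```

Keeping setNames() as a separate line after the loop, as in the code above, works just as well; .final merely keeps the naming next to the loop that produces the list.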