r, cmd, console, multiprocessing, snow

Print progress to Windows cmd within clusterApply or clusterMap


I am calling a Python script on multiple cores using the snow package in R, and I want to print the progress to the console. Calling cat(), message() or print() inside my function produces no output, which makes it difficult to track how far the run has got.

Is it possible to print output to the command line within the clusterApply or clusterMap functions?


This is my current script:

library(snow)
library(rlecuyer)

# Files to process
filenames <- 1:10

# Process function
processfunc <- function(filename, filenames) {
  len_names <- length(filenames)       # number of files
  index <- match(filename, filenames)  # index of the current file
  # Print progress
  cat(paste('Processing input files:',
            format(round(index / len_names * 100, 2), nsmall = 2),
            '% At:', filename))
  # system(paste('python', 'D:/pythonscript.py', filename))
}

corenr <- 7
cl <- makeCluster(rep('localhost', corenr), 'SOCK')
clusterExport(cl, list("processfunc"))
clusterEvalQ(cl, library(stringr))
clusterSetupRNG(cl)
clusterMap(cl, function(x, filenames) processfunc(x, filenames), filenames,
           MoreArgs = list(filenames = filenames))
stopCluster(cl)

Solution

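  • By default, snow redirects the R output of each worker, so cat(), message() and print() called on a worker never reach your console. If you run the script from a terminal (cmd or PowerShell), you can work around this by adding an extra system() or shell() call that prints your string through the operating system instead. For example: shell(paste('echo', 'your string')). Note that shell() is only available on Windows.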

    Working example

    library(snow)
    library(rlecuyer)

    # Files to process
    filenames <- 1:10

    # Process function
    processfunc <- function(filename, filenames) {
      len_names <- length(filenames)       # number of files
      index <- match(filename, filenames)  # index of the current file
      progress <- paste('Processing input files:',
                        format(round(index / len_names * 100, 2), nsmall = 2),
                        '% At:', filename)
      # Print through the OS shell instead of cat(); shell() is Windows-only
      shell(paste('echo', progress))
      # system(paste('python', 'D:/pythonscript.py', filename))
    }

    corenr <- 7
    cl <- makeCluster(rep('localhost', corenr), 'SOCK')
    clusterExport(cl, list("processfunc"))  # make processfunc available on the workers
    clusterEvalQ(cl, library(stringr))
    clusterSetupRNG(cl)
    clusterMap(cl, function(x, filenames) processfunc(x, filenames), filenames,
               MoreArgs = list(filenames = filenames))
    stopCluster(cl)
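
  • A possible alternative, sketched here rather than taken from the question: stop the workers' output from being redirected in the first place. The sketch below assumes the base parallel package instead of snow, and uses the outfile = "" argument of its makeCluster(), which is documented to mean "no redirection". The names progress_line and run_with_progress are hypothetical helpers made up for this illustration. Whether the lines actually show up depends on how R was started: it should work when the script is run with Rscript from cmd or PowerShell, but the output may not appear in GUI front ends.

```r
library(parallel)  # ships with R; used here instead of snow (an assumption)

# Hypothetical helper: build the progress line for one file.
progress_line <- function(filename, filenames) {
  index <- match(filename, filenames)     # position of the current file
  pct <- index / length(filenames) * 100  # percentage reached at this file
  sprintf("Processing input files: %.2f %% At: %s", pct, filename)
}

# Hypothetical wrapper: process all files on a local cluster whose workers
# keep their stdout, so cat() can reach the terminal that launched R.
run_with_progress <- function(filenames, n_workers = 2) {
  cl <- makeCluster(n_workers, outfile = "")  # "" means no redirection
  on.exit(stopCluster(cl))                    # always shut the workers down
  results <- parLapply(cl, filenames, function(f, all_files, fmt) {
    cat(fmt(f, all_files), "\n", sep = "")    # progress line from the worker
    # system(paste("python", "D:/pythonscript.py", f))
    f
  }, all_files = filenames, fmt = progress_line)
  invisible(results)
}
```

    Calling run_with_progress(1:10) from a script started with Rscript should print one progress line per file. The lines come from different workers, so they can arrive out of order. The formatter is passed to the workers as an argument, which avoids a separate clusterExport() call.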