I am calling a Python script on multiple cores using the snow package in R. I want to print the progress to the console, but using cat(), message() or print() inside my function does not produce any output. This makes it difficult to track the progress of my function.
Is it possible to print output to the command line within the clusterApply or clusterMap functions?
This is my current script:
library(snow)
library(rlecuyer)
# Files to process
filenames=1:10
# Process function
processfunc=function(filename,filenames){
  len_names=length(filenames) # Length of filenames
  index = match(filename, filenames) # Index of current file
  cat(paste('Processing input files:',format(round(index/len_names*100,2),nsmall=2),'% At:',filename)) # print progress
  # system(paste('python','D:/pythonscript.py',filename))
}
corenr=7
cl = makeCluster(rep('localhost', corenr), 'SOCK')
clusterExport(cl, list("processfunc"))
clusterEvalQ(cl, library(stringr))
clusterSetupRNG(cl)
clusterMap(cl,function(x,filenames) processfunc(x,filenames),filenames,MoreArgs = list(filenames=filenames))
stopCluster(cl)
If you run the script from a terminal (cmd or PowerShell), you can add an extra system() or shell() call that prints your string, for example: shell(paste('echo', 'your string')). Note that shell() is available only in R on Windows.
Working example
library(snow)
library(rlecuyer)
# Files to process
filenames=1:10
# Process function
processfunc=function(filename,filenames){
  len_names=length(filenames) # Length of filenames
  index = match(filename, filenames) # Index of current file
  shell(paste('echo', paste('Processing input files:',format(round(index/len_names*100,2),nsmall=2),'% At:',filename)))
  # system(paste('python','D:/pythonscript.py',filename))
}
corenr=7
cl = makeCluster(rep('localhost', corenr), 'SOCK')
clusterExport(cl, list("processfunc"))
clusterEvalQ(cl, library(stringr))
clusterSetupRNG(cl)
clusterMap(cl,function(x,filenames) processfunc(x,filenames),filenames,MoreArgs = list(filenames=filenames))
stopCluster(cl)
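Another option that may be worth trying, sketched here under assumptions rather than taken from the script above: snow redirects worker output to a null device by default, and its documentation describes an outfile option where "" means no redirection. With that set, a plain cat() on the workers can reach the terminal that started the master process. The helper names progress_message and run_with_progress are hypothetical, and whether the output actually appears depends on the system and on running from a terminal rather than a GUI such as Rgui.

```r
library(snow)

# Hypothetical helper: builds the progress line without touching the cluster,
# so it can be checked on its own.
progress_message <- function(filename, filenames) {
  index <- match(filename, filenames)               # position of the current file
  pct <- round(index / length(filenames) * 100, 2)  # percentage completed
  sprintf("Processing input files: %.2f %% At: %s", pct, filename)
}

# Worker function: prints the progress line, then would call the Python script.
processfunc <- function(filename, filenames) {
  cat(progress_message(filename, filenames), "\n")
  # system(paste('python', 'D:/pythonscript.py', filename))
  invisible(filename)
}

# Hypothetical wrapper: starts the cluster with worker output not redirected.
run_with_progress <- function(filenames, corenr = 2) {
  # Assumption: outfile = "" leaves worker stdout attached to the master's terminal.
  cl <- makeCluster(rep("localhost", corenr), type = "SOCK", outfile = "")
  on.exit(stopCluster(cl))  # always shut the workers down
  # Both functions are looked up in the global environment and sent to the workers.
  clusterExport(cl, c("processfunc", "progress_message"))
  clusterMap(cl, processfunc, filenames, MoreArgs = list(filenames = filenames))
}
```

Calling run_with_progress(1:10, corenr = 7) from a terminal session should then interleave the progress lines from the workers. Lines can arrive out of order, because each worker prints as soon as it picks up a file.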