Search code examples
rloggingparallel-processingsnowfall

How to log using futile logger from within a parallel method in R?


I am using futile logger in R for logging. I have a parallel algorithm implemented using snowfall in R. Each core of the parallel process logs an intermediate output in the logger. But this output is not showing up in the logger?

Can we log using futile logger from within a parallel job using snowfall?

adding how it was done:

My specific case was a bit different. I am calling a C function from R using a shared object that I created. The function is an iterative algorithm and I need output to be logged every few iterations. I was interested in logging from the C function to futile logger. Why futile logger? Because this is part of a web-application and it makes sense to have all output for a user session to come in a consistent format.

This is the general approach I followed based on the accepted answer.

# init script
# iter logger namespace global variable
assign("MCMC_LOGGER_NAMESPACE", "iter.logger", envir = .GlobalEnv)  

loginit <- function(logfile) { 
  require('futile.logger')
  flog.layout(layout.simple, name = ITER_LOGGER_NAMESPACE)  
  flog.threshold(TRACE, name = ITER_LOGGER_NAMESPACE)
  flog.appender(appender.file(logfile), name = ITER_LOGGER_NAMESPACE)   
  NULL
}

parallel_funct_call_in_R <- function(required args) {    
require('snowfall')  
sfSetMaxCPUs() 
sfInit(parallel = TRUE, cpus = NUM_CPU) 
sfLibrary(required libs)
sfExport(required vars including logger namespace variable ITER_LOGGER_NAMESPACE)
iterLoggers = sprintf(file.path(myloggingdir, 'iterativeLogger_%02d.log', fsep = .Platform$file.sep), seq_len(NUM_CPU))
sfClusterApply(iterLoggers, loginit)  
sfSource(required files)
estimates <- sfLapply(list_to_apply_over, func_callling_C_from_R, required args)
sfStop()  
return(estimates)
}

iterTrackNumFromC <- function(numvec){
# convert numvec to json and log using flog.info
# the logger namespace has already been registered in the individual cores
flog.info("%s", toJSON(numvec), name = ITER_LOGGER_NAMESPACE) 
}

func_callling_C_from_R <- function(args){
 load shared obh using dyn.load
estimates = .C("C_func", args, list(iterTrackNumFromC)) # can use .Call also I guess
return(estimates)
}

Now the C function

void C_func(other args, char **R_loggerfunc){ // R_loggerfunc is passed iterTrackNumFromC    
// do stuff
// call function that logs numeric values to futile.logger
logNumericVecInR();
}

void logNumericVecInR (char *Rfunc_logger, double *NumVec, int len_NumVec){        
    long nargs = 1;        
    void *arguments[1];
    arguments[0] = (double*)NumVec;    
    char *modes[1];
    modes[0] = "double";        
    long lengths[1];
    lengths[0] = len_NumVec;        
    char *results[1];
    //    void call_R(char *func, long nargs, void **arguments, char **modes, long *lengths, char **names, long nres, char **results)    
    call_R(Rfunc_logger, nargs, arguments, modes, lengths, (char**)0, (long)1, results);
}

Hope this helps. If there is an easier way for R and C to share a common logger, please let me know.


Solution

  • A simple way to use the futile.logger package from a snowfall program is to use the sfInit slaveOutfile='' option so that worker output isn't redirected.

    library(snowfall)
    sfInit(parallel=TRUE, cpus=3, slaveOutfile='')
    sfLibrary(futile.logger)
    work <- function(i) {
      flog.info('Got task %d', i)
      i
    }
    sfLapply(1:10, work)
    sfStop()
    

    This is the snowfall interface to the snow makeCluster outfile='' option. It may not work properly with GUI interfaces such as Rgui, depending on how they handle process output, but it does work on Windows using Rterm.exe.

    I think that it's better to specify different log files for each worker. Here's an example of that:

    library(snowfall)
    nworkers <- 3
    sfInit(parallel=TRUE, cpus=nworkers)
    
    loginit <- function(logfile) {
      library(futile.logger)
      flog.appender(appender.file(logfile))
      NULL
    }
    sfClusterApply(sprintf('out_%02d.log', seq_len(nworkers)), loginit)
    
    work <- function(i) {
      flog.info('Got task %d', i)
      i
    }
    sfLapply(1:10, work)
    sfStop()
    

    This avoids all of the extra output coming from snow and puts each worker's log messages into a separate file which can be less confusing.