python, parallel-processing, joblib

Parallel function from joblib running whole code apart from functions


I am using the Parallel function from the joblib package in Python. I would like to use it to parallelize only one of my functions, but unfortunately the whole script runs in parallel (everything apart from the other functions).

Example:

from joblib import Parallel, delayed
print('I do not want this to be printed n times')
n = 6  # number of tasks (example value)
def do_something(arg):
    return arg  # placeholder for the real calculations on arg

Parallel(n_jobs=5)(delayed(do_something)(i) for i in range(0, n))

Solution

  • Missing this design requirement in the documentation is a common mistake, and many users run into exactly the same thing.

    The documentation is quite clear: nothing but imports and def-s should sit at module level outside a __main__ fuse, i.e. an if __name__ == '__main__': guard.

    Without the guard, errors pour out and things go haywire, yet an explicit hint to re-read the documentation is still there, repeated endlessly across the screen:

    [joblib] Attempting to do parallel computing
    without protecting your import on a system that does not support forking.
    
    To use parallel-computing in a script, you must protect your main loop
    using "if __name__ == '__main__'".
    
    Please see the joblib documentation on Parallel for more information
    

    Solution:

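    With the first issue fixed, i.e. with the Parallel call protected by the fuse, the errors stop. A print left at module level, however, still runs in the parent and again in every worker that re-imports the script: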

    C:\Python27.anaconda>python joblib_example.py
    I do not want this to be printed n-times...
    I do not want this to be printed n-times...
    I do not want this to be printed n-times...
    I do not want this to be printed n-times...
    I do not want this to be printed n-times...
    I do not want this to be printed n-times...
    

    Next, a final touch and you are done: move the print under the fuse as well.

    from joblib import Parallel, delayed  # was sklearn.externals.joblib in old scikit-learn releases
    
    def do_some_thing(arg):
        # placeholder for the real work
        return True
    
    if __name__ == '__main__':  # a __main__ FUSE: this block runs only in the parent process
        n = 6
        print("I do not want this to be printed n-times...")
    
        Parallel(n_jobs=5)(delayed(do_some_thing)(i) for i in range(0, n))
    

    C:\Python27.anaconda>python joblib_example.py
    I do not want this to be printed n-times...
    
    C:\Python27.anaconda>
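
To see why the fuse matters, the re-import can be simulated without starting any processes. The sketch below (Python 3, standard library only) is not joblib's own mechanism. It assumes that a worker on a non-forking platform re-executes the script under a module name other than __main__, and uses __mp_main__, the name the standard multiprocessing spawn method uses. SCRIPT, run_as and demo_script.py are hypothetical names made up for the illustration.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Hypothetical script body: one unguarded print and one guarded print.
SCRIPT = '''
print("module level: runs on every import")
if __name__ == "__main__":
    print("guarded: runs only in the parent")
'''


def run_as(run_name):
    """Execute SCRIPT with __name__ set to run_name and return the printed lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            runpy.run_path(path, run_name=run_name)
        return buf.getvalue().splitlines()


# How `python demo_script.py` executes the file in the parent process.
parent_output = run_as("__main__")
# Assumed stand-in for a spawned worker re-importing the same file.
worker_output = run_as("__mp_main__")

print(parent_output)
print(worker_output)
```

The unguarded print shows up in both runs, while the guarded one appears only when the file is executed as __main__. That is the reason moving the print under the fuse leaves a single line of output.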