I am using the Parallel function from the joblib package in Python. I would like to use it to parallelize only one of my functions, but unfortunately the whole script runs in parallel, including the code outside that function.
Example:
from joblib import Parallel, delayed
print ('I do not want this to be printed n times')
def do_something(arg):
    some calculations(arg)
Parallel(n_jobs=5)(delayed(do_something)(i) for i in range(0, n))
This is a common mistake that comes from missing a design requirement stated in the documentation. Many users run into exactly the same thing.
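The documentation is quite clear about not placing any code other than def statements outside a __main__ fuse, that is, an if __name__ == '__main__': guard. On a system that does not support forking, such as Windows, each worker process re-imports the main script, so every unguarded top-level statement is executed again in each worker.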
If you do not, errors indeed spray out and wreak havoc, yet explicit advice to re-read the documentation is right there, repeated endlessly across the screen:
[joblib] Attempting to do parallel computing
without protecting your import on a system that does not support forking.
To use parallel-computing in a script, you must protect your main loop
using "if __name__ == '__main__'".
Please see the joblib documentation on Parallel for more information
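Why unguarded code repeats can be sketched without joblib at all. The snippet below is only a simulation under stated assumptions: it re-runs a small hypothetical script under a module name other than __main__, which is roughly what the standard multiprocessing module does when it spawns a worker (it uses the name __mp_main__); joblib's own backend may differ in detail. SCRIPT and run_as are illustrative names, not part of any library.

```python
import contextlib
import io
import os
import runpy
import tempfile

# Hypothetical script: one unguarded print, one print behind the guard.
SCRIPT = '''
print("unguarded: runs on every import")

def do_something(arg):
    return arg * 2

if __name__ == "__main__":
    print("guarded: runs only in the parent")
'''


def run_as(run_name):
    """Execute SCRIPT under the given module name and return the printed lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "joblib_example.py")
        with open(path, "w") as handle:
            handle.write(SCRIPT)
        buffer = io.StringIO()
        # Capture everything the script prints while it is being executed.
        with contextlib.redirect_stdout(buffer):
            runpy.run_path(path, run_name=run_name)
    return buffer.getvalue().splitlines()


# How "python joblib_example.py" sees the script.
parent_lines = run_as("__main__")
# Assumption: approximates a spawned worker re-importing the script.
worker_lines = run_as("__mp_main__")

print(parent_lines)
print(worker_lines)
```

The unguarded print shows up in both runs, while the guarded one appears only when the script runs as __main__, which is why everything that must run once belongs behind the fuse.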
Once the first issue, the missing import protection, has been fixed by moving the Parallel call behind the __main__ fuse, things get better, although a print left at module level is still repeated:
C:\Python27.anaconda>python joblib_example.py
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
I do not want this to be printed n-times...
Next, a final touch, moving the print behind the fuse as well, and you are done:
from joblib import Parallel, delayed   # same import as in the question
def do_some_thing( arg ):
    # placeholder for the real work
    return True
if __name__ == '__main__': #################################### A __main__ FUSE:
    n = 6
    print( "I do not want this to be printed n-times..." )
    Parallel( n_jobs = 5 ) ( delayed( do_some_thing )( i )
                             for i in range( 0, n )
                             )
C:\Python27.anaconda>python joblib_example.py
I do not want this to be printed n-times...
C:\Python27.anaconda>