I'm trying to learn the joblib module as an alternative to the built-in multiprocessing module in Python. I'm used to using multiprocessing.Pool.imap to run a function over an iterable and return the results as they come in. In this minimal working example, I can't figure out how to do it with joblib:
import joblib, time

def hello(n):
    time.sleep(1)
    print("Inside function", n)
    return n

with joblib.Parallel(n_jobs=1) as MP:
    func = joblib.delayed(hello)
    for x in MP(func(x) for x in range(3)):
        print("Outside function", x)
Which prints:
Inside function 0
Inside function 1
Inside function 2
Outside function 0
Outside function 1
Outside function 2
I'd like to see the output:
Inside function 0
Outside function 0
Inside function 1
Outside function 1
Inside function 2
Outside function 2
Or something similar, indicating that the iterable MP(...) is not waiting for all the results to complete. For a longer demo, change to n_jobs=-1 and range(100).
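For comparison, here is a minimal sketch of the imap behavior described above. It uses multiprocessing.pool.ThreadPool instead of a process pool so that the event log can live in a shared list; ThreadPool exposes the same imap method as multiprocessing.Pool. The names run_imap and events are hypothetical helpers for illustration only.

```python
import time
from multiprocessing.pool import ThreadPool

events = []  # shared log; this works because ThreadPool workers are threads


def hello(n):
    time.sleep(0.1)
    events.append("Inside function %d" % n)
    return n


def run_imap(count):
    """Consume results lazily: imap yields each value once it is ready."""
    results = []
    with ThreadPool(processes=1) as pool:
        for x in pool.imap(hello, range(count)):
            events.append("Outside function %d" % x)
            results.append(x)
    return results


if __name__ == "__main__":
    print(run_imap(3))
    print("\n".join(events))
```

The loop body runs while later tasks are still pending, which is the behavior the joblib example above does not show.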
To get immediate results from joblib, you can subclass the multiprocessing backend and hook your own callback into apply_async, for instance:
import joblib
from joblib._parallel_backends import MultiprocessingBackend

class ImmediateResult_Backend(MultiprocessingBackend):
    def callback(self, result):
        # Called in the parent process as soon as a batch of results is ready.
        print("\tImmediateResult function %s" % (result))

    # Overload apply_async so that self.callback runs for every finished batch.
    def apply_async(self, func, callback=None):
        def chained(result):
            self.callback(result)
            # Still run the callback joblib passed in, so it keeps dispatching tasks.
            if callback is not None:
                callback(result)
        return super().apply_async(func, chained)

joblib.register_parallel_backend('custom', ImmediateResult_Backend, make_default=True)

# hello() is the function defined in the question.
with joblib.Parallel(n_jobs=2) as parallel:
    results = parallel(joblib.delayed(hello)(y) for y in range(3))
    for r in results:
        print("Outside function %s" % (r))
Note: I use time.sleep(n * random.randrange(1, 5)) in def hello(...), so the processes finish at different times.
Output:
Inside function 0
Inside function 1
ImmediateResult function [0]
Inside function 2
ImmediateResult function [1]
ImmediateResult function [2]
Outside function 0
Outside function 1
Outside function 2
Tested with Python 3.4.2 and joblib 0.11.