I am using multiprocessing.Pool to speed up computation, as I call one function multiple times, and then collate the result. Here is a snippet of my code:
import multiprocessing
from functools import partial
def Foo(id:int,constant_arg1:str, constant_arg2:str):
custom_class_obj = CustomClass(constant_arg1, constant_arg2)
custom_class_obj.run() # this changes some attributes of the custom_class_obj
if(something):
return None
else:
return [custom_class_obj]
def parallel_run(iters:int, a:str, b:str):
pool = multiprocessing.Pool(processes=k)
## create the partial function obj before passing it to pool
partial_func = partial(Foo, constant_arg1=a, constant_arg2=b)
## create the variable id list
iter_list = list(range(iters))
all_runs = pool.map(partial_func, iter_list)
return all_runs
This throws the following error in the multiprocessing module:
multiprocessing.pool.MaybeEncodingError: Error sending result: '[[<CustomClass object at 0x1693c7070>], [<CustomClass object at 0x1693b88e0>], ....]'
Reason: 'TypeError("cannot pickle 'module' object")'
How can I resolve this?
I was able to replicate the error message with a minimal example of an un-picklable class. The error basically states the instance of your class can't be pickled because it contains a reference to a module, and modules are not picklable. You need to comb through CustomClass
to make sure instances don't hold things like open file handles, module references, etc.. If you need to have those things, you should use __getstate__
and __setstate__
to customize the pickle and unpickle process.
distilled example of your error:
from multiprocessing import Pool
from functools import partial
class klass:
def __init__(self, a):
self.value = a
import os
self.module = os #this fails: can't pickle a module and send it back to main process
def foo(a, b, c):
return klass(a+b+c)
if __name__ == "__main__":
with Pool() as p:
a = 1
b = 2
bar = partial(foo, a, b)
res = p.map(bar, range(10))
print([r.value for r in res])