Tags: python-3.x, multiprocessing, pickle, python-multiprocessing

Using multiprocessing.Pool in Python with a function returning custom object


I am using multiprocessing.Pool to speed up a computation: I call one function many times and then collate the results. Here is a snippet of my code:

import multiprocessing
from functools import partial

def Foo(id: int, constant_arg1: str, constant_arg2: str):
    custom_class_obj = CustomClass(constant_arg1, constant_arg2)
    custom_class_obj.run()  # this changes some attributes of custom_class_obj

    if something:
        return None
    else:
        return [custom_class_obj]



def parallel_run(iters: int, a: str, b: str):
    # create the partial function object before passing it to the pool
    partial_func = partial(Foo, constant_arg1=a, constant_arg2=b)

    # create the list of ids that varies between calls
    iter_list = list(range(iters))

    # k is the number of worker processes (not shown in this snippet)
    with multiprocessing.Pool(processes=k) as pool:
        all_runs = pool.map(partial_func, iter_list)

    return all_runs

This throws the following error in the multiprocessing module:

multiprocessing.pool.MaybeEncodingError: Error sending result: '[[<CustomClass object at 0x1693c7070>], [<CustomClass object at 0x1693b88e0>], ....]'
Reason: 'TypeError("cannot pickle 'module' object")'

How can I resolve this?


Solution

  • I was able to reproduce the error message with a minimal example of an unpicklable class. The error says that an instance of your class can't be pickled because it holds a reference to a module, and modules are not picklable. Pool.map pickles each return value to send it from the worker back to the main process, which is where this fails. Comb through CustomClass to make sure its instances don't hold things like open file handles or module references. If you do need those attributes, define __getstate__ and __setstate__ to customize how instances are pickled and unpickled.

    Distilled example of your error:

    from multiprocessing import Pool
    from functools import partial
    
    class klass:
        def __init__(self, a):
            self.value = a
            import os
            self.module = os #this fails: can't pickle a module and send it back to main process
    
    def foo(a, b, c):
        return klass(a+b+c)
    
    if __name__ == "__main__":
        with Pool() as p:
            a = 1
            b = 2
            bar = partial(foo, a, b)
            res = p.map(bar, range(10))
        print([r.value for r in res])
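
One possible way to apply the __getstate__ / __setstate__ suggestion to the distilled example is sketched below. The names Klass and unpicklable_attrs are hypothetical and only for illustration; the real CustomClass may hold different unpicklable attributes and may need a different way to rebuild them. The sketch uses a plain pickle round trip instead of a Pool, on the assumption that this mirrors what happens when a worker sends its result back to the main process.

```python
import os
import pickle


class Klass:
    """Hypothetical stand-in for CustomClass that holds a module reference."""

    def __init__(self, a):
        self.value = a
        self.module = os  # unpicklable attribute, as in the distilled example

    def __getstate__(self):
        # Copy the instance dict and drop the attribute pickle cannot handle.
        state = self.__dict__.copy()
        del state["module"]
        return state

    def __setstate__(self, state):
        # Restore the picklable attributes, then rebuild the dropped one.
        self.__dict__.update(state)
        self.module = os


def unpicklable_attrs(obj):
    """Return the names of instance attributes that fail to pickle on their own."""
    bad = []
    for name, attr in vars(obj).items():
        try:
            pickle.dumps(attr)
        except Exception:
            bad.append(name)
    return bad


if __name__ == "__main__":
    obj = Klass(3)
    print(unpicklable_attrs(obj))            # ['module']
    clone = pickle.loads(pickle.dumps(obj))  # round trip now succeeds
    print(clone.value, clone.module is os)   # 3 True
```

The helper is a rough diagnostic for the "comb through CustomClass" step: it only checks top-level instance attributes, so an unpicklable object nested inside a picklable-looking container would still need a closer look.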