
Replace pickle in Python multiprocessing lib


I need to execute the code below (a simplified version of my real code base, in Python 3.5):

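import multiprocessing

def forever(do_something=None):
    while True:
        do_something()

if __name__ == "__main__":
    # Fails wherever the target and its args must be pickled (e.g. on Windows).
    p = multiprocessing.Process(target=forever, args=(lambda: print("do something"),))
    p.start()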

In order to create the new process, Python needs to pickle the target function and the lambda passed to it as an argument. Unfortunately, pickle cannot serialize lambdas, and the output looks like this:

_pickle.PicklingError: Can't pickle <function <lambda> at 0x00C0D4B0>: attribute lookup <lambda> on __main__ failed
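
The same failure can be reproduced without starting a process at all. This is only a minimal sketch using the standard library; the helper name can_pickle is made up for illustration and is not part of pickle or multiprocessing:

```python
import pickle

def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# A function that can be looked up by name is pickled by reference.
print(can_pickle(print))                           # True

# A lambda has no importable name, so the attribute lookup fails.
print(can_pickle(lambda: print("do something")))   # False
```

pickle stores functions as a module name plus a qualified name, which is why the error message above mentions an attribute lookup for <lambda> on __main__.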

I discovered cloudpickle, which can serialize and deserialize lambdas and closures through the same interface as pickle.

How can I force the Python multiprocessing module to use cloudpickle instead of pickle?

Clearly hacking the code of the standard lib multiprocessing is not an option!

Thanks

Charlie


Solution

  • Try multiprocess. It's a fork of multiprocessing that uses the dill serializer instead of pickle -- there are no other changes in the fork.

    I'm the author. I encountered the same problem as you several years ago, and ultimately I decided that hacking the standard library was my only choice, as some of the pickle code in multiprocessing is in C.

    >>> import multiprocess as mp
    >>> p = mp.Pool()
    >>> p.map(lambda x:x**2, range(4))
    [0, 1, 4, 9]
    >>>