Search code examples
pythonpython-2.7python-multiprocessingprocess-pool

multiprocessing pool not working in nested functions


Following code not executing as expected.

import multiprocessing

lock = multiprocessing.Lock()
def dummy():
    def log_results_l1(results):
        lock.acquire()
        print("Writing results", results)
        lock.release()

    def mp_execute_instance_l1(cmd):
        print(cmd)
        return cmd

    cmds = [x for x in range(10)]

    pool = multiprocessing.Pool(processes=8)

    for c in cmds:
        pool.apply_async(mp_execute_instance_l1, args=(c, ), callback=log_results_l1)

    pool.close()
    pool.join()
    print("done")


dummy()

But it does work if the functions are not nested. What is going on.


Solution

  • multiprocessing.Pool methods like the apply* and *map* methods have to pickle both the function and the arguments. Functions are pickled by their qualified name; essentially, on unpickling, the other process needs to be able to import the module they were defined in and do a getattr call to find the function in question. Nested functions aren't available by name outside the function they were defined in, so pickling fails. When you move the function to global scope, you fix this, which is why it works when you do that.