python · multiprocessing · concurrent.futures

ProcessPoolExecutor does not start


I am working in a Jupyter notebook. I'm new to multiprocessing in Python, and I'm trying to parallelize the calculation of a function over a grid of parameters. Here is a snippet of code quite representative of what I'm doing:

import os
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def f(x,y):
    print(os.getpid(), x,y,x+y)
    return x+y

xs = np.linspace(5,7,3).astype(int)
ys = np.linspace(1,3,3).astype(int)

func = lambda p: f(*p)
with ProcessPoolExecutor() as executor:
    args = (arg for arg in zip(xs,ys))
    results = executor.map(func, args)
    
for res in results:
    print(res)

The executor doesn't even start.

There's no problem whatsoever if I execute the same thing serially, e.g. with a list comprehension:

args = (arg for arg in zip(xs,ys))
results = [func(arg) for arg in args]

Solution

  • Are you running on Windows? I think your main problem is that each worker process tries to re-execute your whole script, so you should guard the entry point with an `if __name__ == "__main__":` check. I think you have a second issue with the lambda: the processes communicate by pickling data, and a lambda can't be pickled. There are workarounds for that, but in this case it looks like you don't really need the lambda at all, since `executor.map` accepts multiple iterables and passes one item from each as the function's arguments. Try something like this:

    import os
    import numpy as np
    from concurrent.futures import ProcessPoolExecutor
    
    
    def f(x, y):
        print(os.getpid(), x, y, x + y)
        return x + y
    
    if __name__ == '__main__':
    
        xs = np.linspace(5, 7, 3).astype(int)
        ys = np.linspace(1, 3, 3).astype(int)
    
        with ProcessPoolExecutor() as executor:
            results = executor.map(f, xs, ys)
    
        for res in results:
            print(res)