Tags: python, python-3.x, multithreading

Python thread pools not working: pool.map freezes, pool.map_async returns a MapResult that's not iterable


I'm trying to use multiprocessing.Pool(), as it seems the best fit for my project. I've followed several tutorial articles on how it works and tried their examples. Some that I'm referring to:

https://superfastpython.com/threadpool-python

https://superfastpython.com/multiprocessing-pool-map

https://www.delftstack.com/howto/python/python-pool-map-multiple-arguments

https://www.codesdope.com/blog/article/multiprocessing-using-pool-in-python

Somehow all the examples provided there by seemingly experienced programmers are broken: whenever I try to run them just as described, my application freezes or errors out. Based on what those resources suggested, I put together the following test:

import multiprocessing as mp

def double(i):
    return i * 2

def main():
    pool = mp.Pool()
    for result in pool.map(double, [1, 2, 3]):
        print(result)

main()

This causes the application to spawn a lot of processes and then freeze. None of those processes use any CPU, so they don't appear to be stuck processing something; they just sit there idle forever. Out of curiosity I tried pool.map_async instead of pool.map, and now I get the following error instead:

TypeError: 'MapResult' object is not iterable

Strange, since none of those examples mentioned a MapResult object that cannot be iterated. One of them did suggest using result.get() instead. I tried that, and even with map_async everything just freezes again.

pool = mp.Pool()
result = pool.map(double, [1, 2, 3])
print(result)

Removing the for loop in favor of printing the result directly gives exactly the same behavior: if I use pool.map, everything freezes forever; if I use pool.map_async, I'm told it's a MapResult, unless I try to print result.get(), in which case everything freezes again. The same two issues occur when getting the data with pool.apply(double) or pool.apply_async(double) instead. Any idea what both I and seemingly every tutorial on Python thread pools are doing wrong?


Solution

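  • Code at the top level of your script is run by every process. On platforms that start workers by spawning a fresh interpreter (the default on Windows and macOS), each worker re-imports your script, so the unguarded main() call runs again in every worker and each one tries to create a pool of its own. You need to make sure that main() is only run by the main process.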

    The simple solution is:

    if __name__ == '__main__':
        main()
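
Putting the guard together with the test from the question, a complete script might look like the sketch below. The helper names run_map and run_map_async are hypothetical and were added only for illustration. The with block is one way to shut the pool down once the results are in; the original test never closed its pool.

```python
import multiprocessing as mp


def double(i):
    # Worker function. It is defined at module top level so that worker
    # processes can look it up by name after importing this module.
    return i * 2


def run_map(values):
    # Hypothetical helper: pool.map blocks until every result is ready
    # and returns a plain list.
    with mp.Pool() as pool:
        return pool.map(double, values)


def run_map_async(values):
    # Hypothetical helper: pool.map_async returns a result handle rather
    # than a list, so iterating over it directly raises TypeError.
    # Calling get() blocks until the list of results is available.
    with mp.Pool() as pool:
        async_result = pool.map_async(double, values)
        return async_result.get()


def main():
    for result in run_map([1, 2, 3]):
        print(result)
    print(run_map_async([1, 2, 3]))


# Only the process that was launched directly has __name__ == "__main__".
# Workers that re-import this file skip the call below.
if __name__ == "__main__":
    main()
```

Run as a script, this should print 2, 4 and 6 on separate lines, followed by [2, 4, 6]. Note that get() is called inside the with block, because leaving the block terminates the pool.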