concurrent.futures misbehaves on Python 3.8+ on macOS


A co-worker and I have run into an issue on our Macs (his: Intel, mine: M1). I'm on macOS 12.5.1 Monterey (not sure about his).

With Python 3.7, the following code works as expected:

Toy Example

from concurrent.futures import ProcessPoolExecutor

def foo(a, b=0):
  return a + b

with ProcessPoolExecutor(max_workers=4) as executor:
  future = executor.submit(foo, 1, b=2)
  print(future.result())

# prints "3"

But with Python 3.8 through 3.10, I get a traceback like this:

Process SpawnProcess-1:
Traceback (most recent call last):
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/concurrent/futures/process.py", line 237, in _process_worker
    call_item = call_queue.get(block=True)
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'foo' on <module '__main__' (built-in)>
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/user/.pyenv/versions/3.10.2/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

If we start a python:3.10-slim Docker container on the Mac and run the same code inside it, it works fine.

I can't find any concrete question or evidence that others have run into this problem, but this toy example fails on both our Macs. It seems to have trouble finding the definition of the foo function. I originally ran into this problem with Pebble, but have now reproduced it with the standard library.

Any history of problems with Mac Python 3.8+ and concurrent.futures?

More Detailed Example

It was pointed out that the toy example above is missing an if __name__ == "__main__" check, so here is another example, using Pebble, that includes one. It works everywhere except Python 3.8+ on a Mac, where it throws the same sort of error. This is how I use Pebble in my real code:

from pebble import concurrent


class Foo:
    def __init__(self, timeout):
        self.timeout = timeout

    def do_math(self, a, b):
        # Define our task function
        @concurrent.process(timeout=self.timeout)
        def bar(a, b=0):
            return a + b

        future = bar(a, b)
        return future.result()


if __name__ == "__main__":
    foo = Foo(timeout=5)
    print(foo.do_math(2, 3))
    # Prints 5, except on Mac Python 3.8+

Again, on Mac Python 3.8+ (only) it throws this error:

pebble.common.RemoteTraceback: Traceback (most recent call last):
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/concurrent/process.py", line 205, in _function_lookup
    return _registered_functions[name]
KeyError: 'Foo.do_math.<locals>.bar'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/common.py", line 174, in process_execute
    return function(*args, **kwargs)
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/concurrent/process.py", line 194, in _trampoline
    function = _function_lookup(name, module)
  File "/Users/user/Projects/temp/venv/lib/python3.10/site-packages/pebble/concurrent/process.py", line 209, in _function_lookup
    function = getattr(mod, name)
AttributeError: module '__mp_main__' has no attribute 'Foo.do_math.<locals>.bar'

Solution

  • Python 3.8 changed the default multiprocessing start method on macOS from fork to spawn, because forking was leading to crashes. (Fork-without-exec is precarious in general and can cause problems on other systems too, but macOS system frameworks in particular do not play well with forking.)
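
To check which start method a given interpreter actually uses, something like the sketch below may help. It relies only on multiprocessing.get_start_method() and multiprocessing.get_all_start_methods() from the standard library; the helper name start_method_report is made up for illustration. On Linux with Python 3.10 (as in the python:3.10-slim container from the question) the default is fork, which would explain why the same code works there.

```python
import multiprocessing
import sys


def start_method_report():
    """Collect the platform, Python version, and multiprocessing start methods."""
    return {
        "platform": sys.platform,
        "python": sys.version_info[:2],
        # The start method used when no explicit context is requested.
        "default": multiprocessing.get_start_method(),
        # Every start method this platform supports.
        "available": multiprocessing.get_all_start_methods(),
    }


if __name__ == "__main__":
    print(start_method_report())
```

Running this under both 3.7 and 3.8+ on the same Mac should show the default changing from fork to spawn.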

    Your code is unsafe to use with the spawn start method. In the first example, this is because you're missing an if __name__ == '__main__' guard. In the second example, it's because you're using a nested function, which cannot be loaded by the worker process.
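
A rough way to see why a nested function is a problem: the standard library's process pools send callables to workers with pickle, and pickle stores a function by its qualified name rather than by its code. A name like Foo.do_math.<locals>.bar cannot be looked up from the module. The Pebble traceback in the question shows a similar name-based lookup (getattr(mod, name)), though Pebble's mechanism is its own. The sketch below uses only the standard pickle module; top_level, make_nested and can_pickle are hypothetical names for illustration.

```python
import pickle


def top_level(a, b=0):
    """Defined at module level, so it can be found again by name."""
    return a + b


def make_nested():
    """Return a function defined inside another function."""
    def bar(a, b=0):
        return a + b
    return bar


def can_pickle(func):
    """Return True if pickle can serialize func, False otherwise."""
    try:
        pickle.dumps(func)
    except (pickle.PicklingError, AttributeError):
        # Which exception is raised for local objects varies by Python version.
        return False
    return True


if __name__ == "__main__":
    nested = make_nested()
    print(top_level.__qualname__, can_pickle(top_level))
    print(nested.__qualname__, can_pickle(nested))
```

When saved and run as a script, the module-level function should pickle and the nested one should not.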


    You need to make your code spawn-safe. Add if __name__ == '__main__' guards, stop trying to run nested functions in worker processes, and fix whatever else you might be doing that doesn't work with spawn.
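
As a minimal sketch of a spawn-safe version of the first example: the task stays at module level, and everything that starts workers sits behind the guard. Note that the traceback in the question shows File "<stdin>", so the code was apparently fed to the interpreter directly; with spawn the worker has to re-import the main module, so this assumes the code is saved to a .py file. The spawn context is requested explicitly here only so the behaviour is the same on every platform; run_with_spawn is a hypothetical helper name.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def foo(a, b=0):
    """Module-level task, importable by name from a spawned worker."""
    return a + b


def run_with_spawn(a, b):
    """Submit foo to a pool that uses the spawn start method."""
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=4, mp_context=ctx) as executor:
        future = executor.submit(foo, a, b=b)
        return future.result()


if __name__ == "__main__":
    # Only the original process gets here; spawned workers import this
    # file under a different module name and skip this block.
    if "__file__" in globals():
        print(run_with_spawn(1, 2))  # prints 3
    else:
        # No file on disk (interactive session or stdin): a spawned
        # worker would have nothing to re-import foo from.
        print("Save this as a .py file and run it as a script.")
```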

    You could try passing a fork context to Pebble:

    import multiprocessing
    
    @concurrent.process(timeout=self.timeout, context=multiprocessing.get_context('fork'))
    def bar(a, b=0):
        ...
    

    but there's a good reason the default was changed. Using fork on Mac is likely to lead to weird crashes. If you're lucky, it'll crash immediately. If you're unlucky, you'll get an urgent call at 3 in the morning on a Saturday 5 months from now, when you've forgotten all about this and you have to figure out the problem from scratch.
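
Instead of forcing fork, the second example could be restructured so the task is no longer nested. The sketch below is an assumption-heavy stand-in: it uses the standard library's ProcessPoolExecutor rather than Pebble, so the timeout here only limits how long result() waits and does not reproduce Pebble's timeout handling. The point is only the shape: bar lives at module level and Foo.do_math passes it by reference.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def bar(a, b=0):
    """Module-level task: a spawned worker can import it by name."""
    return a + b


class Foo:
    def __init__(self, timeout):
        self.timeout = timeout

    def do_math(self, a, b):
        # Request spawn explicitly so this behaves the same everywhere.
        ctx = multiprocessing.get_context("spawn")
        with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as executor:
            future = executor.submit(bar, a, b)
            # Raises a timeout error if no result arrives in time; the
            # worker itself is not killed (unlike a real task timeout).
            return future.result(timeout=self.timeout)


if __name__ == "__main__":
    # Spawned workers re-import this file, so it must exist on disk.
    if "__file__" in globals():
        foo = Foo(timeout=5)
        print(foo.do_math(2, 3))  # prints 5
```

Creating a pool per call is wasteful; in real code the executor would more likely be created once and reused.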