python python-3.x pathos

What is the canonical way to use locking with `pathos.pools.ProcessPool`?


Let's consider the following example:

from pathos.pools import ProcessPool

class A:
    def run(self, arg: int):

        shared_variable = 100
        
        def __run_parallel(arg: int):
            local_variable = 0

            # ...

            if local_variable > shared_variable:
                shared_variable = local_variable

        ProcessPool(4).map(__run_parallel, range(1000))

With four processes, there is clearly a data race between the check `if local_variable > shared_variable:` and the assignment `shared_variable = local_variable`.

Consequently, I'd like to introduce a locking mechanism around the if block, so I tried the following:

from pathos.pools import ProcessPool
from multiprocessing import Lock

class A:
    def run(self, arg: int):

        lock = Lock()
        shared_variable = 100
        
        def __run_parallel(arg: int):
            local_variable = 0

            # ...

            lock.acquire()
            if local_variable > shared_variable:
                shared_variable = local_variable
            lock.release()

        ProcessPool(4).map(__run_parallel, range(1000))

However, I get the error `RuntimeError: Lock objects should only be shared between processes through inheritance`.

In the multiprocessing library, it seems as if the canonical way to achieve the desired mutual exclusion would be to use a Manager object.

How can I do this idiomatically in pathos?


Solution

  • pathos leverages multiprocess, which has the same interface as multiprocessing but uses dill for serialization. You can access it in either of these ways:

    >>> import pathos as pa
    >>> import multiprocess as mp
    >>> mp.Manager is pa.helpers.mp.Manager
    True