Tags: python, multiprocessing, multiprocess

How to declare a multiprocessing Pool as a variable and import it into another Python file


I have two Python files.

File: multiprocess_pool.py


from multiprocessing import Pool

A = Pool(5)

File: main.py

from multiprocess_pool import A

def start(x):
    print(x+1)

if __name__ == '__main__':
    with A as a:
        print(a.apply_async(start, args=(1,)).get())

When I execute python main.py, I get the following error:

AttributeError: Can't get attribute 'start' on <module '__main__' 

I don’t want to declare the Pool in main.py. My service is a web server that needs a Pool to run some CPU-intensive tasks, so I’d like to declare the Pool as a variable in one file and import it into another.


Solution

  • If you want to avoid depending on __main__, or on files at all, you can use multiprocess. It's a fork of multiprocessing that uses dill for serialization, so it can pickle functions defined in __main__ or in an interactive session, and it otherwise keeps the same interface.

    >>> import multiprocess
    >>> def start(x):
    ...   print(x)
    ...   return x+1
    ... 
    >>> with multiprocess.Pool(4) as pool:
    ...   print(pool.apply_async(start, args=(1,)).get())
    ... 
    1
    2
    

    If you want a persistent Pool instance that can easily be shut down and restarted, you might try pathos. It is mostly a wrapper around multiprocess that (1) provides a map interface matching Python's built-in serial map, and (2) provides more persistent Pool objects.

    >>> import pathos
    >>> pool = pathos.pools.ProcessPool(4)
    >>> print(pool.apipe(start, 1).get())
    1
    2
    >>> pool.close()
    >>> pool.join()
    >>> 
    >>> pool.restart()
    <multiprocess.pool.Pool state=RUN pool_size=4>
    >>> print(pool.apipe(start, 1).get())
    1
    2
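
If you would rather stay with the standard library, one possible workaround (not part of the answer above, just a sketch) is to import a function that creates the Pool on first use instead of importing a ready-made Pool. The error in the question most likely comes from Pool(5) starting its workers while multiprocess_pool is being imported, which happens before start has been defined in main.py. The module name pool_provider and the helpers get_pool and shutdown_pool are hypothetical names chosen for illustration.

```python
# pool_provider.py  (hypothetical module name)
import multiprocessing

# The shared pool is created on first use, not at import time, so importing
# this module never starts worker processes.
_pool = None


def get_pool(processes=5, start_method=None):
    """Return the shared Pool, creating it on the first call.

    start_method=None selects the platform's default start method.
    The arguments only take effect on the call that creates the pool.
    """
    global _pool
    if _pool is None:
        ctx = multiprocessing.get_context(start_method)
        _pool = ctx.Pool(processes)
    return _pool


def shutdown_pool():
    """Close the shared pool, wait for its workers, and forget it."""
    global _pool
    if _pool is not None:
        _pool.close()
        _pool.join()
        _pool = None


# Sketch of main.py under the same assumptions:
#
#     from pool_provider import get_pool, shutdown_pool
#
#     def start(x):
#         return x + 1
#
#     if __name__ == '__main__':
#         print(get_pool().apply_async(start, args=(1,)).get())
#         shutdown_pool()
```

Because the workers are only started inside the `if __name__ == '__main__':` block, start already exists in __main__ by the time they need it. In a web server the same idea applies: call get_pool() from the request handler or a startup hook rather than at import time, and call shutdown_pool() on exit.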