Search code examples
pythonpython-3.xmultiprocessingcontextmanager

What is the benefit of using a context mananger with multiprocessing.Manager?


In the documentation, Manager is used with a context manager (i.e. with) like so:

from multiprocessing.managers import BaseManager

class MathsClass:
    def add(self, x, y):
        return x + y
    def mul(self, x, y):
        return x * y

class MyManager(BaseManager):
    pass

MyManager.register('Maths', MathsClass)

if __name__ == '__main__':
    with MyManager() as manager:
        maths = manager.Maths()
        print(maths.add(4, 3))         # prints 7
        print(maths.mul(7, 8))         # prints 56

But what is the benefit of this, with the exception of the namespace? For opening file streams, the benefit is quite obvious in that you don't have to manually .close() the connection, but what is it for Manager? If you don't use it in a context, what steps do you have to use to ensure that everything is closed properly?

In short, what is the benefit of using the above over something like:

manager = MyManager()
maths = manager.Maths()
print(maths.add(4, 3))         # prints 7
print(maths.mul(7, 8))         # prints 56

Solution

  • But what is the benefit of this (...)?

    First, you get the primary benefit of almost any context managers. You have a well-defined lifetime for the resource. It is allocated and acquired when the with ...: block is opened. It is released when the blocks ends (either by reaching the end or because an exception is raised). It is still deallocated whenever the garbage collector gets around to it but this is of less concern since the external resource has already been released.

    In the case of multiprocessing.Manager (which is a function that returns a SyncManager, even though Manager looks lot like a class), the resource is a "server" process that holds state and a number of worker processes that share that state.

    what is [the benefit of using a context manager] for Manager?

    If you don't use a context manager and you don't call shutdown on the manager then the "server" process will continue running until the SyncManager's __del__ is run. In some cases, this might happen soon after the code that created the SyncManager is done (for example, if it is created inside a short function and the function returns normally and you're using CPython then the reference counting system will probably quickly notice the object is dead and call its __del__). In other cases, it might take longer (if an exception is raised and holds on to a reference to the manager then it will be kept alive until that exception is dealt with). In some bad cases, it might never happen at all (if SyncManager ends up in a reference cycle then its __del__ will prevent the cycle collector from collecting it at all; or your process might crash before __del__ is called). In all these cases, you're giving up control of when the extra Python processes created by SyncManager are cleaned up. These processes may represent non-trivial resource usage on your system. In really bad cases, if you create SyncManager in a loop, you may end up creating many of these that live at the same time and could easily consume huge quantities of resources.

    If you don't use it in a context, what steps do you have to use to ensure that everything is closed properly?

    You have to implement the context manager protocol yourself, as you would for any context manager you used without with. It's tricky to do in pure-Python while still being correct. Something like:

    manager = None
    try:
        manager = MyManager()
        manager.__enter__()
        # use it ...
    except:
        if manager is not None:
            manager.__exit__(*exc_info())
    else:
        if manager is not None:
            manager.__exit__(None, None, None)
    

    start and shutdown are also aliases of __enter__ and __exit__, respectively.