Tags: python, python-3.x, python-multiprocessing, circular-reference, finalizer

Why isn't the __del__ method called?


In Python 3's multiprocessing module, the __del__ method of my object is never called in the child process.

I've read other questions about circular references, but I can't find one that covers the multiprocessing case.

There is a circular reference in foo. __del__ is called when foo is called directly, but when foo runs in a child process, __del__ is never called.

import multiprocessing
import weakref


class Foo():
    def __init__(self):
        self.b = []

    def __del__(self):
        print('del')


def foo():
    print('call foo')
    f = Foo()
    a = [f]
    # a = [weakref.ref(f)]
    f.b.append(a)


# call foo in other process
p = multiprocessing.Process(target=foo)
p.start()
p.join()


# call foo
foo()

Output:
call foo
call foo
del

Why is __del__ not called in the child process p?


Solution

  • Forked Process objects terminate with os._exit() after running their task, which forcibly ends the child process without the normal cleanup Python performs on exit. Cyclic garbage is never collected, because the process is terminated before the cyclic GC gets a chance to run; it's simply dropped on the floor, leaving the OS to reclaim the memory.

    This is intentional. Exiting normally (invoking all the usual cleanup procedures) would risk problems such as unflushed buffers being flushed in both parent and child (doubling output), along with other weirdness that arises when a forked process inherits all of the parent's state but isn't supposed to use it unless told to do so explicitly.
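To see the effect of os._exit() in isolation, without involving multiprocessing at all, here is a minimal sketch. It runs the same kind of cycle in a fresh interpreter twice: once exiting normally and once via os._exit(0). The names CHILD_TEMPLATE and run_child are made up for this illustration, and the prints use flush=True so that a lost output buffer isn't mistaken for a finalizer that never ran.

```python
import subprocess
import sys

# Child script: builds the same kind of cycle as foo(), then exits in the
# way selected by the exit_call placeholder at the bottom of the template.
CHILD_TEMPLATE = """
import os

class Foo:
    def __init__(self):
        self.b = []

    def __del__(self):
        print('del', flush=True)

def foo():
    print('call foo', flush=True)
    f = Foo()
    a = [f]
    f.b.append(a)  # f -> f.b -> a -> f: a reference cycle

foo()
{exit_call}
"""


def run_child(exit_call):
    """Run the child script in a fresh interpreter and return its output lines."""
    code = CHILD_TEMPLATE.format(exit_call=exit_call)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


normal_exit = run_child("")            # interpreter shuts down normally
hard_exit = run_child("os._exit(0)")   # exits immediately, skipping cleanup

print("normal exit:", normal_exit)
print("os._exit(0):", hard_exit)
```

On CPython, the normal exit should report both call foo and del, while the os._exit(0) run should report only call foo, which mirrors what happens in the forked child.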

    You could write a wrapper function that would invoke the "real" function, then trigger a cycle collection before returning, but it's hard to write correctly and quite brittle. An initial stab at it would be something like:

    import gc
    import traceback
    
    def clear_cycles_after(func, *args, **kwargs):
        try:
            return func(*args, **kwargs)
        except BaseException as e:
            # Clear locals of all frames in the traceback
            traceback.clear_frames(e.__traceback__)  # Requires 3.4+
            raise  # Reraises original exception with locals cleaned from all frames
        finally:
            gc.collect()  # Now that we've cleaned the locals from any exception traceback,
                          # it should be possible to identify cyclic garbage
                          # and gc.collect() will clean it up
    

    You'd use it by replacing:

    p = multiprocessing.Process(target=foo)
    

    with:

    p = multiprocessing.Process(target=clear_cycles_after, args=(foo,))
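
As a rough in-process check of the wrapper (no child process involved), the sketch below records finalizer calls in a list instead of printing them. The events list and the two helper functions make_cycle and make_cycle_then_fail are hypothetical names added for this illustration; the wrapper itself is repeated only so the snippet runs on its own.

```python
import gc
import traceback

events = []  # records finalizer calls so they can be inspected afterwards


class Foo:
    def __init__(self):
        self.b = []

    def __del__(self):
        events.append('del')


def clear_cycles_after(func, *args, **kwargs):
    # Same wrapper as above, repeated so this snippet is self-contained
    try:
        return func(*args, **kwargs)
    except BaseException as e:
        traceback.clear_frames(e.__traceback__)
        raise
    finally:
        gc.collect()


def make_cycle():
    f = Foo()
    a = [f]
    f.b.append(a)  # cycle survives the return until the GC runs


def make_cycle_then_fail():
    f = Foo()
    a = [f]
    f.b.append(a)
    raise ValueError('boom')  # the traceback keeps this frame's locals alive


clear_cycles_after(make_cycle)

try:
    clear_cycles_after(make_cycle_then_fail)
except ValueError:
    pass

print(events)
```

On CPython, both calls should leave a 'del' entry in events by the time the wrapper returns or re-raises, which is the behavior the forked child needs before it hits os._exit().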
    

    I don't really recommend this solution, though. If the child must perform cleanup beyond releasing process memory (which the OS reclaims anyway), the better approach is to implement the context manager protocol on the relevant type(s) and create and control them with with statements. For existing types you can't modify directly, contextlib.contextmanager can provide the same functionality. with statements guarantee deterministic cleanup even in the presence of cyclic references, and even on non-CPython interpreters, which aren't reference counted and therefore don't clean up deterministically without with statements even when there are no cycles. Anything less than with statements (or try/finally blocks with equivalent effect) is going to be some combination of brittle, non-portable, or non-functional.
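
For the case of a type you can't modify, a minimal sketch of the contextlib.contextmanager approach might look like the following. LegacyResource, its release method, the managed helper, and the log list are all hypothetical stand-ins invented for this example; only contextlib.contextmanager is a real library API.

```python
from contextlib import contextmanager

log = []  # collects messages so the order of events is visible


class LegacyResource:
    """Hypothetical stand-in for a third-party type we can't edit."""

    def __init__(self, name):
        self.name = name

    def release(self):
        log.append('released ' + self.name)


@contextmanager
def managed(name):
    """Create a LegacyResource and guarantee release() when the block ends."""
    res = LegacyResource(name)
    try:
        yield res
    finally:
        res.release()  # runs on normal exit and when the body raises


with managed('a') as r:
    log.append('using ' + r.name)

try:
    with managed('b'):
        raise RuntimeError('boom')
except RuntimeError:
    pass

print(log)
```

The try/finally around the yield is what makes the cleanup deterministic: it doesn't depend on reference counts or on the garbage collector at all.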

    Using context management, your class and function would look like:

    class Foo:
        def __init__(self):
            self.b = []
    
        def close(self):  # Convenient to have a way to manually clean up when needed
            print('del')
            del self.b[:]  # Clear contents of b to avoid cyclic references after cleanup
    
        # Optional: Provide __del__ as a best-effort fallback in case the user neither calls close nor uses a with statement
        __del__ = close
        
        # Define context management special methods in terms of shared close
        def __enter__(self):
            return self  # No-op when entering with block
        def __exit__(self, typ, exc, tb):
            self.close()
    
    
    def foo():
        print('call foo')
        with Foo() as f:  # Create and manage with with statement
            a = [f]
            f.b.append(a)
        # f's contents are cleaned here, so when foo returns, on CPython, a and f will be cleaned
        # since they're not part of a reference cycle anymore, and the actual Foo
        # object bound to f and a[0] will be removed deterministically
    

    You may now see del printed more than once in some cases, particularly when you include the optional __del__ = close line as a fallback for users who don't use a with statement: close and/or __exit__ runs first, then __del__ runs later. There's no harm in that; the contained list just gets emptied twice.
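
If the repeated output or repeated cleanup is unwanted, one option (not part of the code above) is to make close idempotent with a guard flag. The sketch below assumes a hypothetical _closed attribute and records calls in an events list rather than printing, so the result is easy to inspect.

```python
events = []  # records each real cleanup


class Foo:
    def __init__(self):
        self.b = []
        self._closed = False  # hypothetical guard flag, not in the original class

    def close(self):
        if self._closed:
            return  # already cleaned up; later calls do nothing
        self._closed = True
        events.append('del')
        del self.b[:]  # break the cycle

    # Best-effort fallback if the object is never closed explicitly
    __del__ = close

    def __enter__(self):
        return self

    def __exit__(self, typ, exc, tb):
        self.close()


def foo():
    with Foo() as f:
        a = [f]
        f.b.append(a)


foo()
print(events)
```

With the guard in place, the cleanup body runs once no matter how many times close, __exit__, and __del__ end up being called.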