python, garbage-collection, weak-references, circular-reference

Preserving circular references after garbage collection


import weakref
import gc

class MyClass(object):
    def refer_to(self, thing):
        self.refers_to = thing

foo = MyClass()
bar = MyClass()
foo.refer_to(bar)
bar.refer_to(foo)
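# Keep only weak references, then drop the strong ones.
foo_ref = weakref.ref(foo)
bar_ref = weakref.ref(bar)
del foo
del bar
gc.collect()  # the foo-bar cycle is now unreachable, so it is collected
print(foo_ref())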

I want foo_ref and bar_ref to retain weak references to foo and bar respectively as long as they reference each other*, but this instead prints None. How can I prevent the garbage collector from collecting certain objects within reference cycles?

bar should be garbage-collected in this code because it is no longer part of the foo-bar reference cycle:

baz = MyClass()
baz.refer_to(foo)
foo.refer_to(baz)
gc.collect()

* I realize it might seem pointless to prevent circular references from being garbage-collected, but my use case requires it. I have a bunch of objects that refer to each other in a web-like fashion, along with a WeakValueDictionary that keeps a weak reference to each object in the bunch. I only want an object in the bunch to be garbage-collected when it is orphaned, i.e. when no other objects in the bunch refer to it.


Solution

  • Normally, holding only weak references to an object means you cannot prevent it from being garbage collected.

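    However, on Python 2 (and Python 3 before 3.4) there is a trick you can use to keep objects that are part of a reference cycle from being garbage collected: define a __del__() method on them. Since Python 3.4 (PEP 442), cycles containing objects with __del__() methods are collected normally, so the trick no longer applies there.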

    From the gc module documentation:

    gc.garbage

    A list of objects which the collector found to be unreachable but could not be freed (uncollectable objects). By default, this list contains only objects with __del__() methods. Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable, including objects not necessarily in the cycle but reachable only from it. Python doesn’t collect such cycles automatically because, in general, it isn’t possible for Python to guess a safe order in which to run the __del__() methods. If you know a safe order, you can force the issue by examining the garbage list, and explicitly breaking cycles due to your objects within the list. Note that these objects are kept alive even so by virtue of being in the garbage list, so they should be removed from garbage too. For example, after breaking cycles, do del gc.garbage[:] to empty the list. It’s generally better to avoid the issue by not creating cycles containing objects with __del__() methods, and garbage can be examined in that case to verify that no such cycles are being created.
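The manual cleanup that the documentation describes (examine gc.garbage, break the cycles, then empty the list) might look roughly like the sketch below. Node, break_saved_cycles and demo are hypothetical names, not part of the original post. As an assumption made only so the sketch runs on any CPython version, it uses gc.DEBUG_SAVEALL to park unreachable objects in gc.garbage; on Python 2, a __del__() method on the class would populate the list by itself.

```python
import gc


class Node(object):
    """Hypothetical stand-in for the objects in the question."""

    def refer_to(self, thing):
        self.refers_to = thing


def break_saved_cycles():
    """Break Node cycles parked in gc.garbage, then empty the list.

    Returns the number of Node objects whose outgoing reference was removed.
    """
    broken = 0
    for obj in gc.garbage:
        if isinstance(obj, Node) and hasattr(obj, "refers_to"):
            del obj.refers_to  # cut the edge that closes the cycle
            broken += 1
    # gc.garbage itself holds strong references, so it must be emptied too.
    del gc.garbage[:]
    return broken


def demo():
    """Park one two-node cycle in gc.garbage and clean it up by hand."""
    old_flags = gc.get_debug()
    gc.collect()
    del gc.garbage[:]
    # Assumption: DEBUG_SAVEALL is used here only to fill gc.garbage.
    gc.set_debug(gc.DEBUG_SAVEALL)
    try:
        a, b = Node(), Node()
        a.refer_to(b)
        b.refer_to(a)
        del a, b
        gc.collect()  # unreachable objects are appended to gc.garbage
    finally:
        gc.set_debug(old_flags)
    saved = sum(1 for o in gc.garbage if isinstance(o, Node))
    broken = break_saved_cycles()
    return saved, broken, len(gc.garbage)


if __name__ == "__main__":
    print(demo())
```

Here the only "safe order" decision is trivial, because the hypothetical Node class holds a single outgoing reference; a real object graph may need more care about which edges to cut first.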

    When you define MyClass as follows:

    class MyClass(object):
        def refer_to(self, thing):
            self.refers_to = thing
        def __del__(self):
            print('Being deleted now, bye-bye!')
    

    then your example script prints:

    <__main__.MyClass object at 0x108476a50>
    

    but commenting out one of the .refer_to() calls results in:

    Being deleted now, bye-bye!
    Being deleted now, bye-bye!
    None
    

    In other words, simply by defining a __del__() method, we prevented the reference cycle from being garbage collected, while orphaned objects are still deleted.

    Note that for this to work you need circular references; any object in your object graph that is not part of a reference cycle will be collected regardless.
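The reason objects outside a cycle are picked off regardless is that CPython frees them through reference counting, without involving the cyclic collector at all. The sketch below tries to make that visible; it assumes CPython, and Node and lifetimes are hypothetical names. Node deliberately defines no __del__() method, so the cycle behaves as in the original question and disappears once gc.collect() runs.

```python
import gc
import weakref


class Node(object):
    """Hypothetical stand-in for MyClass, without a __del__() method."""

    def refer_to(self, thing):
        self.refers_to = thing


def lifetimes():
    """Return (orphan_alive, cycle_alive_before, cycle_alive_after)."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector from running on its own
    try:
        foo, bar, orphan = Node(), Node(), Node()
        foo.refer_to(bar)
        bar.refer_to(foo)
        orphan.refer_to(foo)  # points into the cycle; nothing points at it

        foo_ref = weakref.ref(foo)
        orphan_ref = weakref.ref(orphan)
        del foo, bar, orphan

        # The orphan's reference count dropped to zero, so it is gone already.
        orphan_alive = orphan_ref() is not None
        # The cycle keeps its own reference counts above zero.
        cycle_alive_before = foo_ref() is not None
        gc.collect()
        cycle_alive_after = foo_ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return orphan_alive, cycle_alive_before, cycle_alive_after


if __name__ == "__main__":
    print(lifetimes())
```

On CPython this should report the orphan as dead straight away, and the cycle as alive until the explicit gc.collect() call.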