Search code examples
pythonreferencecycle

Circular reference in python


I am not sure how python deal with circular reference (reference cycle). I check some answers and found this:

Python's standard reference counting mechanism cannot free cycles, so the structure in your example would leak.
The supplemental garbage collection facility, however, is enabled by default and should be able to free that structure, if none of its components are reachable from the outside anymore and they do not have __del__() methods.

I guess it means that if none of the instances in reference cycle are reachable outside, both of them will be cleaned up. Is this true?
On the other hand, there is a package weakref which is often used to deal with map dictionary. The purpose of its existence, I suppose, is to avoid reference cycle.
In summary, can python deal with reference cycle automatically? If they can, why do we have to use weakref?


Solution

  • You do not have to worry about reference cycles if the objects in the cycle don't have a custom __del__ method, as Python can (and will) destroy the objects in any order.

    If your custom methods do have a __del__ method, Python doesn't know if one objects deletion effects another objects deletion at all. Say when an object is deleted, it sets some global variable. So, the objects stick around. As a quick test, you can make a __del__ method that prints something:

    class Deletor(str):
        def __del__(self):
            print(self, 'has been deleted')
    
    a = Deletor('a')  # refcount: 1
    del a             # refcount: 0
    

    Outputs:

    a has been deleted
    

    But if you had code like this:

    a = Deletor('a')  # refcount: 1
    a.circular = a    # refcount: 2
    del a             # refcount: 1
    

    It outputs nothing, as Python can't safely delete a. It becomes an "uncollectable object", and can be found in gc.garbage

    There are two solutions to this. The weakref (which doesn't increase the reference count):

    # refcount:             a  b
    a = Deletor('a')      # 1  0
    b = Deletor('b')      # 1  1
    b.a = a               # 2  1
    a.b = weakref.ref(b)  # 2  1
    del a                 # 1  1
    del b                 # 1  0
    # del b kills b.a     # 0  0
    

    Outputs:

    b has been deleted
    a has been deleted
    

    (Note how b is deleted first before it can delete a)

    And you can manually delete the cycles (If you can keep track of them):

    # refcount          a  b
    a = Deletor('a')  # 1  0
    b = Deletor('b')  # 1  1
    b.a = a           # 2  1
    a.b = b           # 2  2
    del b             # 2  1
    print('del b')
    del a.b           # 2  0
    # b is deleted, now dead
    # b.a now dead    # 1  0
    print('del a.b')
    del a             # 0  0
    print('del a')
    

    Outputs:

    del b
    b has been deleted
    del a.b
    a has been deleted
    del a
    

    Notice how b is deleted after a.b is deleted.


    † Starting with Python 3.4, things changed due to PEP 442. __del__ may be called even on objects in a reference cycle and the semantics are slightly different, so it is slightly harder to become an uncollectable object.

    A weakref is still helpful, as it is less intense on the garbage collector, and memory can be reclaimed earlier.