When is the reference count for a local variable in a python function decreased?


I have the following function:

def myfn():
    big_obj = BigObj()
    result = consume(big_obj)
    return result

When is the reference count of the object created by BigObj() increased or decreased? Is it:

  1. when consume(big_obj) is called (since big_obj is not referenced afterwards in myfn)
  2. when the function returns
  3. at some other point that I don't know yet

Would it make a difference to change the last line to:

return consume(big_obj)

Edit (clarification for comments):

  • A local variable exists until the function returns
  • The reference can be deleted with del obj

But what about temporaries (e.g. f1(f2()))?

I checked references to temporaries with this code:

import sys
def f2(c):
    print("f2: References to c", sys.getrefcount(c))


def f0():
    print("f0")
    f2(object())

def f1():
    c = object()
    print("f1: References to c", sys.getrefcount(c))
    f2(c)

f0()
f1()

This prints:

f0
f2: References to c 3
f1: References to c 2
f2: References to c 4

It seems that references to temporaries are held. Note that getrefcount reports one more than you might expect because its own argument holds a reference, too.
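
The off-by-one can be sidestepped by comparing two readings instead of interpreting a single value. A minimal sketch, assuming CPython; the class Tracked and the function extra_refs are hypothetical names introduced only for illustration:

```python
import sys


class Tracked:
    """Hypothetical stand-in for any object whose references we count."""


def extra_refs():
    obj = Tracked()
    before = sys.getrefcount(obj)  # includes getrefcount's own temporary reference
    holders = [obj, obj]           # two additional strong references
    after = sys.getrefcount(obj)   # same overhead as before, plus the two new ones
    return after - before


print(extra_refs())  # 2 on CPython
```

Because the overhead of the getrefcount call is the same in both readings, the difference reflects only the references added in between.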


Solution

  • Disclaimer: Most of this information comes from the comments, so credit goes to everyone who participated in the discussion.

    Exactly when an object is deleted is, in general, an implementation detail. I will refer to CPython, which is based on reference counting. I ran the code examples with CPython 3.10.0.

    • An object is deleted when its reference count hits zero.
    • Returning from a function deletes all of its local references.
    • Assigning a new value to a name decreases the reference count of the old value.
    • Passing a local to a function increases the reference count. The extra reference lives on the stack (frame).
    • Returning from a function removes that reference from the stack.

    The last point holds even for temporaries such as f(g()): the last reference to the result of g() is deleted when f returns (assuming that neither f nor g saves a reference somewhere); see here
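
The rules about rebinding a name and returning from a function can be observed with weak references, which do not keep their target alive. A minimal sketch, assuming CPython; Tracked, rebind_demo and return_demo are hypothetical names that do not appear in the question:

```python
import weakref


class Tracked:
    """Hypothetical helper; plain object() instances do not support weak references."""


def rebind_demo():
    obj = Tracked()
    probe = weakref.ref(obj)            # a weak reference does not keep obj alive
    alive_before = probe() is not None
    obj = None                          # rebinding drops the only strong reference
    alive_after = probe() is not None
    return alive_before, alive_after


def return_demo():
    local = Tracked()
    return weakref.ref(local)           # the local reference goes away on return


print(rebind_demo())                    # (True, False) on CPython
print(return_demo()() is None)          # True on CPython
```

Other Python implementations may delete the objects later, so the printed values are specific to reference counting.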

    So for the example from the question:

    def myfn():
        big_obj = BigObj()         # reference 1: the local name big_obj
        result = consume(big_obj)  # reference 2: held on the stack for the call
                                   # to consume (not counting any references
                                   # made inside consume). After consume
                                   # returns, its stack frame and reference 2
                                   # are gone; reference 1 remains.
        return result              # myfn returns: reference 1 is deleted and
                                   # the BigObj instance is deleted with it

    def consume(big_obj):
        pass  # the parameter big_obj is reference 3 while consume runs
    

    If we changed this to:

    def myfn():
        return consume(BigObj()) # reference is still saved on the stack 
                                 # frame and will be only deleted after  
                                 # consume returns
    def consume(big_obj):
        pass # consume is holding reference 2
    

    How can I reliably check whether an object was deleted?

    You cannot rely on gc.get_objects(). The gc module exists to detect and recycle reference cycles, and not every object is tracked by it. Instead, you can create a weak reference and check whether it is still valid.

    class BigObj:
        pass
    
    import sys  # needed for sys.getrefcount below
    import weakref
    ref = None
    
    def make_ref(obj):
        global ref
        ref = weakref.ref(obj)
        return obj
    
    def myfn():
        return consume(make_ref(BigObj()))
    
    def consume(obj):
        obj = None  # remove this line to see the impact on the reference count
        print(sys.getrefcount(ref()))
        print(ref())  # Still alive on CPython 3.10: the stack for the call to consume
                      # holds a reference (other versions may behave differently)
    

    myfn()

    How can I pass a reference to a function and remove all references in the calling function?

    You can box the reference, pass the box to the function, and clear the boxed reference from inside the function:

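    class Ref:
        """Mutable box holding a single reference."""
        def __init__(self, ref):
            self.ref = ref

        def clear(self):
            self.ref = None
    
    def f1(ref):
        r = ref.ref   # take our own reference out of the box
        ref.clear()   # the box no longer references the object
        # r is now the only remaining reference; del r would delete the object
    
    def f2():
        f1(Ref(object()))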
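
Whether boxing really lets the called function release the object before the caller returns can be checked with a weak reference. A minimal sketch, assuming CPython; Tracked, Box, take, consume and caller are hypothetical names chosen for this illustration:

```python
import weakref


class Tracked:
    """Hypothetical payload class that supports weak references."""


class Box:
    """Hypothetical mutable box holding a single strong reference."""

    def __init__(self, value):
        self.value = value

    def take(self):
        # Hand the value to the caller and empty the box in one step.
        value, self.value = self.value, None
        return value


def consume(box, probe):
    obj = box.take()                      # obj is now the only strong reference
    alive_while_held = probe() is not None
    del obj                               # drop it while the caller is still running
    alive_after_del = probe() is not None
    return alive_while_held, alive_after_del


def caller():
    obj = Tracked()
    probe = weakref.ref(obj)              # used only to observe the object's lifetime
    box = Box(obj)
    del obj                               # the caller keeps no direct reference
    return consume(box, probe)


print(caller())  # (True, False) on CPython
```

The caller still holds the box for the whole call, but since the box has been emptied, nothing in the caller keeps the payload alive.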