Tags: python, multithreading, garbage-collection, weak-references

Can an object be garbage collected if one of its members is a running thread?


I have a custom thread subclass that repeatedly calls a method on a "bound" object. The goal is to automatically join this thread whenever the "bound" object gets GC'ed:

from threading import Thread, Event
from weakref import ref, WeakSet
from typing import Callable, Generic
from typing_extensions import TypeVar
import atexit

T = TypeVar("T", bound=object)

_workers: WeakSet = WeakSet()

@atexit.register
def stopAllWorkers():
    # copy because _workers will be mutated during enumeration.
    refs = list(_workers)
    for worker in refs:
        try:
            worker.stop()
        except Exception:
            pass


class MaintenanceWorker(Thread, Generic[T]):

    def __init__(self, bound_to: T, interval: float, task: Callable[[T], None]):
        self._interval = interval
        self._task = task
        self._ref = ref(bound_to, lambda _: self.stop())
        self._finished = Event()
        super().__init__()

    def stop(self):
        self._finished.set()
        _workers.discard(self)

    def run(self):
        self._finished.clear()
        _workers.add(self)
        while True:
            self._finished.wait(self._interval)
            if self._finished.is_set() or (subject := self._ref()) is None:
                _workers.discard(self)
                break
            try:
                self._task(subject)
            except Exception:
                pass
            finally:
                del subject

Can instances of the following class be garbage collected at all, given that one of their members is a running thread?


class Foo:

    def __init__(self):
        self._worker = MaintenanceWorker(bound_to=self, interval=15 * 60.0, task=Foo.bar)
        self._worker.start()

    def bar(self):
        # some convoluted logic
        ...

Solution

  • An object's members have no effect on whether that object can be garbage collected. Reference counts are tracked per object, and only references to an object keep it alive.

    class A:
        def __del__(self):
            print("A destroyed")
    
    class B:
        def __init__(self, a):
            self.a = a
        def __del__(self):
            print("B destroyed")
    
    def worker(a):
        b = B(a)
        # B destroyed here
    
    def main():
        a = A()
        worker(a)
        print("worker returned")
        # A destroyed here
    
    main()
    

    Output in CPython 3.12:

    B destroyed
    worker returned
    A destroyed
    

    The fact that B holds a reference to A has no effect on the lifetime of B; it only keeps A alive for as long as B exists.

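    Note that in your code the worker holds a strong reference to the object (subject) while it calls _task, so the object cannot be deleted while _task is running. It can be deleted before or after the call, because between calls the worker holds only a weak reference.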
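The point about strong and weak references can be checked with a small experiment. The following is only a sketch, assuming CPython's reference counting; Owner, _loop and demo_weak_worker are hypothetical names invented for this illustration and are not part of the question's code. The worker thread receives a weak reference to its owner, takes a strong reference only for the duration of each call, and exits once the owner is gone.

```python
import threading
import time
import weakref


def _loop(owner_ref, stop, interval=0.01):
    """Worker body: holds the owner strongly only while calling tick()."""
    while not stop.wait(interval):
        owner = owner_ref()      # temporary strong reference
        if owner is None:        # owner was collected, nothing left to do
            break
        owner.tick()
        del owner                # drop the strong reference before sleeping


class Owner:
    """Hypothetical stand-in for Foo: owns a running thread as a member."""

    def __init__(self):
        self.calls = 0
        self._stop = threading.Event()
        # The thread only gets a weak reference and the event, never self.
        self._thread = threading.Thread(
            target=_loop, args=(weakref.ref(self), self._stop), daemon=True
        )
        self._thread.start()

    def tick(self):
        self.calls += 1


def demo_weak_worker():
    collected = threading.Event()
    owner = Owner()
    thread = owner._thread
    # The callback fires when the owner is finalized (possibly in the worker thread).
    probe = weakref.ref(owner, lambda _: collected.set())
    while owner.calls == 0:      # wait until the worker has run at least once
        time.sleep(0.005)
    del owner                    # last strong reference held by this function
    was_collected = collected.wait(timeout=5)
    thread.join(timeout=5)
    return was_collected, thread.is_alive()


if __name__ == "__main__":
    print(demo_weak_worker())
```

If the thread's target were a bound method of the owner (for example target=self.tick), the thread would hold a strong reference to the owner for as long as it runs, and the owner could not be collected.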

    Lastly, note that CPython deallocates an object as soon as its reference count hits 0; the garbage collector is only responsible for cleaning up reference cycles.
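To illustrate the difference between reference counting and the cycle collector, here is a minimal sketch, assuming CPython. Node and demo_cycle are hypothetical names used only for this demonstration. The automatic collector is switched off temporarily so that the cycle is only reclaimed by the explicit gc.collect() call.

```python
import gc
import weakref


class Node:
    """Hypothetical class that can point at another Node."""

    def __init__(self):
        self.other = None


def demo_cycle():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic cycle collector out of the experiment
    try:
        # No cycle: the object is freed as soon as its last reference goes away.
        lone = Node()
        lone_ref = weakref.ref(lone)
        del lone
        freed_without_cycle = lone_ref() is None

        # Cycle: a and b keep each other's reference count above zero.
        a, b = Node(), Node()
        a.other, b.other = b, a
        a_ref = weakref.ref(a)
        del a, b
        alive_before_collect = a_ref() is not None

        gc.collect()  # the cycle collector finds and frees the unreachable pair
        freed_after_collect = a_ref() is None
        return freed_without_cycle, alive_before_collect, freed_after_collect
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(demo_cycle())
```

The code in the question does not appear to create such a cycle, since the worker refers to the bound object only through a weak reference, so reference counting alone should be enough to free a Foo instance.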