Tags: python, dynamic, garbage-collection, classloader

In Python, how can you unload generated classes?


I am working on a library that loads files (HDF5, via PyTables) into an object structure. The source code of the classes used for the structure is stored as a string in the HDF5 file and then executed like this:

class NamespaceHolder(dict):
    # stmt is the source code holding all the class defs
    def execute(self, stmt):
        exec stmt in self
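
For readers on Python 3, where exec is a function rather than a statement, a rough equivalent of the holder above might look like the following. This is only a sketch under that assumption; the one-class source string and the names holder and loaded are made up for illustration.

```python
class NamespaceHolder(dict):
    """A dict that doubles as the namespace the stored source is executed in."""

    def execute(self, stmt):
        # stmt is the source code holding all the class defs
        exec(stmt, self)


# Illustrative usage with a made-up one-class source string.
holder = NamespaceHolder()
holder.execute("class DummyA(object):\n    pass\n")
loaded = holder["DummyA"]
print(loaded.__name__)
```

After execute() returns, every class defined by the source is available as a key of the holder.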

The problem is that loading multiple classes this way causes objects, namely the class definitions themselves, to end up in gc.garbage, the garbage collector's list of uncollectable objects. I can also load the code into a global dictionary, but the orphaned classes remain. Is there any way to unload the classes?

The main problem seems to be each class's __mro__ attribute, which contains a reference back to the class itself, creating a reference cycle that the garbage collector apparently can't handle.
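
The self-reference described above is easy to observe. A minimal sketch, assuming Python 3 syntax (exec as a function) rather than the Python 2 syntax used elsewhere in this question; make_class is a hypothetical helper, not part of the library:

```python
import gc


def make_class():
    # Hypothetical helper: build a class from source text via exec.
    namespace = {}
    exec("class DummyA(object):\n    pass\n", namespace)
    return namespace["DummyA"]


cls = make_class()
mro = cls.__mro__

# The class is the first entry of its own method resolution order...
first_is_self = mro[0] is cls
# ...and the collector sees a reference in each direction, i.e. a cycle.
class_refers_to_mro = any(r is mro for r in gc.get_referents(cls))
mro_refers_to_class = any(r is cls for r in gc.get_referents(mro))

print(first_is_self, class_refers_to_mro, mro_refers_to_class)
```

This only shows that the cycle exists; it says nothing by itself about whether the collector can reclaim it.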

Here is a small test case to see for yourselves:

import gc

if __name__ == "__main__":
    gc.enable()
    gc.set_debug(gc.DEBUG_LEAK)

    code = """
class DummyA(object):
    pass
"""
    context = {}

    exec code in context
    exec code in context

    gc.collect()
    print len(gc.garbage)

Just a note: I have already argued against creating classes by parsing text from a file, but the team is set on this approach and sees benefits that I don't, so moving away from it isn't feasible right now.


Solution

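  • Calling gc.set_debug(gc.DEBUG_LEAK) is what causes the leak: DEBUG_LEAK includes DEBUG_SAVEALL, which makes the collector append every unreachable object it finds to gc.garbage instead of freeing it. Try this: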

    import gc
    
    def foo():                              
        code = """
    class DummyA(object):
        pass             
    """
        context = {}
        exec code in context
        exec code in context
    
        gc.collect()
        print len(gc.garbage), len(gc.get_objects())
    
    gc.enable()
    foo(); foo()  # number of objects doesn't increase
    gc.set_debug(gc.DEBUG_LEAK)
    foo() # leaks
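
The same effect can be checked without reading the debug output. The following is a minimal sketch, assuming CPython 3 syntax and using gc.DEBUG_SAVEALL alone (the part of DEBUG_LEAK that keeps objects alive) so that nothing is printed to stderr; load_and_drop and the variable names are hypothetical.

```python
import gc
import weakref

CODE = """
class DummyA(object):
    pass
"""


def load_and_drop():
    """Exec CODE into a throwaway namespace; return a weak reference to the class."""
    context = {}
    exec(CODE, context)
    return weakref.ref(context["DummyA"])


# Start from a known state: no debug flags, empty gc.garbage.
gc.set_debug(0)
gc.collect()
del gc.garbage[:]

# Without debug flags, the orphaned class is reclaimed by a collection.
ref = load_and_drop()
gc.collect()
collected_normally = ref() is None
garbage_normally = len(gc.garbage)

# With DEBUG_SAVEALL, unreachable objects are stored in gc.garbage instead.
gc.set_debug(gc.DEBUG_SAVEALL)
load_and_drop()
gc.collect()
garbage_with_saveall = len(gc.garbage)

# Restore normal behaviour and let the saved objects be freed.
gc.set_debug(0)
del gc.garbage[:]
gc.collect()

print(collected_normally, garbage_normally, garbage_with_saveall)
```

If the weak reference is dead and gc.garbage stays empty in the first half, the classes are being unloaded; a non-empty gc.garbage in the second half comes only from the debug flag.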