Been a longtime browser of SO, finally asking my own questions!
So, I am writing an automation script/module that looks through a directory recursively for python modules with a specific name. If I find a module with that name, I load it dynamically, pull what I need from it, and then unload it. I noticed though that simply del'ing the module does not remove all references to that module, there is another lingering somewhere and I do not know where it is. I tried taking a peek at the source code, but couldn't make sense of it too well. Here is a sample of what I am seeing, greatly simplified:
I am using Python 3.5.2 (Anaconda v4.2.0). I am using importlib, and that is what I want to stick with. I also want to be able to do this with vanilla python-3.
I got the import from source from the python docs here (yes I am aware this is the Python 3.6 docs).
My main driver...
# main.py
import importlib.util
import sys
def foo():
spec = importlib.util.spec_from_file_location('a', 'a.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(sys.getrefcount(module))
del module
del spec
if __name__ == '__main__':
foo()
print('THE END')
And my sample module...
# a.py
print('hello from a')
class A():
def __del__(self):
print('SO LONG A!')
inst = A()
Output:
python main.py
HELLO FROM A!
2
THE END
SO LONG A!
I expected to see "SO LONG A!" printed before "THE END". So, where is this other hidden reference to my module? I understand that my del's are gratuitous with the fact that I have it wrapped in a function. I just wanted the deletion and scope to be explicit. How do I get a.py to completely unload? I plan on dynamically loading a ton of modules like a.py, and I do not want to hold on to them any longer than I really have to. Is there something I am missing?
There is a circular reference here, the module object references objects that reference the module again.
This means the module is not cleared immediately (as the reference count never goes to 0 by itself). You need to wait for the circle to be broken by the garbage collector.
You can force this by calling gc.collect()
:
import gc
# ...
if __name__ == '__main__':
foo()
gc.collect()
print('THE END')
With that in place, the output becomes:
$ python main.py
hello from a
2
SO LONG A!
THE END