python, python-3.9

Why does this global not stay updated when we return back to main, unless I add a new entrypoint to my program?


m1 is a global variable in collect_finetuning.py:

m1: list = []

When I run collect_finetuning.py via:

if __name__ == "__main__":
    main()

I see the following behavior: a function in collect_finetuning updates m1, but the updated value is lost once control returns to main():

[two screenshots, not reproduced here]

I was able to get the function in the second picture (main) to see the updated m1, as intended, simply by running this script instead of running collect_finetuning.py directly:

from scripts.collect_finetuning import main

main()

If I don't do this, then once invoke() returns to main(), print("m1 was:", m1) outputs [].

I am confused about why this worked (m1 stays updated, and I can keep using its updated value after returning to main()), and why running collect_finetuning.py directly doesn't retain the updated value of m1. Please help me understand why this worked, thank you!


Solution

  • It sounds like you're loading the same file (collect_finetuning.py) under two different names in Python. When you run it as a script, it is loaded as __main__. If another module imports it later on, it is loaded a second time under its own name (collect_finetuning).

    These will be two separate module objects, each with their own global namespace, even though they were both loaded from the same file. If you change a variable in one of the global namespaces, the other one will not be changed. The call stack you're looking at in your IDE may be misleading you, because it only lists the filename of the code that's being run. It doesn't tell you which module object is being used.

    To check whether this is what is happening, look in the sys.modules dictionary. You will likely see '__main__' and 'collect_finetuning' (possibly with a package prefix, such as 'scripts.collect_finetuning') listed as separate keys, with separate module objects as values that both report the same file as their source. That's the cause of the issue.

    As for how to fix it, the simple answer is to avoid using a library module (that will be imported in other code) as a script. Your helper script is one way to achieve that (making the script a trivial separate file that imports the real code from the library). An alternative is to move the parts of the file that are used by other code into a separate module, while main() and other things that are only needed when running as a script stay in the current file.
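The double loading described above can be reproduced with a small, self-contained sketch. It is only an illustration under stated assumptions: the helper() function, the run_collect.py file name, and the run() wrapper are hypothetical stand-ins, not code from the question. The sketch writes a toy collect_finetuning.py to a temporary directory, runs it once directly and once through a separate entry-point script, and prints the names under which the file ends up in sys.modules in each case.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for collect_finetuning.py. helper() imports the module
# by name, the way some other module in the project might.
MODULE_SOURCE = textwrap.dedent('''
    import os
    import sys

    m1: list = []

    def helper():
        # If this file was started as a script, this import creates a second
        # module object next to __main__, with its own m1.
        import collect_finetuning
        collect_finetuning.m1.append("updated")

    def main():
        helper()
        print("m1 was:", m1)
        # List every sys.modules key whose module was loaded from this file.
        this_file = os.path.realpath(__file__)
        names = sorted(
            name for name, mod in list(sys.modules.items())
            if getattr(mod, "__file__", None)
            and os.path.realpath(mod.__file__) == this_file
        )
        print("loaded as:", names)

    if __name__ == "__main__":
        main()
''')

# Hypothetical entry-point script, mirroring the workaround in the question.
RUNNER_SOURCE = "from collect_finetuning import main\n\nmain()\n"


def run(script: Path) -> str:
    """Run a script in a fresh interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, script.name],
        cwd=script.parent, capture_output=True, text=True, check=True,
    )
    return result.stdout


with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "collect_finetuning.py"
    runner_path = Path(tmp) / "run_collect.py"
    module_path.write_text(MODULE_SOURCE)
    runner_path.write_text(RUNNER_SOURCE)

    direct_output = run(module_path)  # loaded as __main__, then again by name
    runner_output = run(runner_path)  # loaded once, by name

print(direct_output)
# m1 was: []
# loaded as: ['__main__', 'collect_finetuning']
print(runner_output)
# m1 was: ['updated']
# loaded as: ['collect_finetuning']
```

In the direct run, main() reads the m1 that belongs to __main__, while helper() appended to the m1 of the separately imported collect_finetuning module. With the separate entry point there is only one module object, so both refer to the same list.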