Python global variables in shared modules


I am trying to understand the behavior of variables shared across modules. My impression was a global variable in one module could be accessed in another, and sometimes this seems to be what happens, but sometimes it is not. (I recognize global variables are usually not the right tool, but there are situations where their use is appropriate.)

Here is a code example to illustrate my confusion:

shared.py

globalvar=0                 # a global scalar variable
globalarray1=[1,2,3]        # a global array modified by .append()
globalarray2=['A','B','C']  # a global array modified by reassignment

def set_globals():
    global globalvar
    global globalarray1
    global globalarray2

    globalvar=1
    globalarray1.append(4)
    globalarray2=['a','b','c']

    print("set_globals: globalvar="+str(globalvar))
    print("set_globals: globalarray1="+str(globalarray1))
    print("set_globals: globalarray2="+str(globalarray2))

module.py

from paas.shared import *

def _main_():
    global globalvar
    global globalarray1
    global globalarray2

    set_globals()

    print("main: globalvar="+str(globalvar))
    print("main: globalarray1="+str(globalarray1))
    print("main: globalarray2="+str(globalarray2))

_main_()

When I run the program, I get the following output:

set_globals: globalvar=1
set_globals: globalarray1=[1, 2, 3, 4]
set_globals: globalarray2=['a', 'b', 'c']
main: globalvar=0
main: globalarray1=[1, 2, 3, 4]
main: globalarray2=['A', 'B', 'C']

So

  • 'main' is not seeing the same scalar variable 'globalvar' that is being set by the 'set_globals' function in the shared module (the assignment 'took' and was printed within 'set_globals', but 'main' is printing something else).
  • It does appear to be seeing the same 'globalarray1' - the append(4) that modified globalarray1 in set_globals is recognized when it is printed by 'main'.
  • It is not seeing the same 'globalarray2' - the variable assigned in set_globals is correctly printed by 'set_globals', but 'main' is printing the original value assigned to globalarray2.

Can someone explain what is going on? I have only seen fairly simple tutorial-type documentation about global variables; if there is documentation that describes what is happening here I would appreciate a pointer.

ADDED NOTE: This is not related to the way a local variable in a function hides a global variable with the same name; those are two different variables with the same name and different initial values. It does appear that the import creates a new module variable with the same name, but it starts with the initial value of the variable of the same name from the imported module.


Solution

  • What you're observing stems from Python's behavior with respect to variables, imports, and mutability. Let's break it down:

    1. Shared Mutable Objects (Lists in this case):
    • globalarray1 is a list, which is a mutable object in Python. When you do globalarray1.append(4), you are mutating the original list object.
    • When you import globalarray1 in module.py using from paas.shared import *, module.py gets its own name globalarray1, bound to that same list object. Both names refer to one list, so when set_globals mutates it in shared.py, the mutation is visible in module.py.
    2. Reassigning Variables:
    • globalarray2=['a','b','c'] inside set_globals rebinds the name globalarray2 in shared.py to a new list object. However, the name globalarray2 in module.py still references the original list object.
    • The scalar globalvar behaves the same way. Integers are immutable, so globalvar=1 inside set_globals does not change the object 0; it rebinds the name globalvar in shared.py's namespace to a different object (1). The separate name globalvar in module.py is still bound to the old value (0).
    3. Python's Import Mechanism:
    • When you use the syntax from paas.shared import *, you're importing the names from shared.py into module.py. If you later rebind one of these names in shared.py (as you do with globalvar and globalarray2), the binding in module.py remains unchanged.
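The three points above can be reproduced in a single file. This is a minimal sketch, not the original project layout: it builds a stand-in module in memory under the hypothetical name shared_demo (in place of paas.shared) and imports the names explicitly, which binds them the same way the star import does.

```python
import sys
import types

# Build a stand-in for shared.py in memory so this runs as a single file.
# The module name "shared_demo" is hypothetical.
SHARED_SOURCE = '''
globalvar = 0
globalarray1 = [1, 2, 3]
globalarray2 = ['A', 'B', 'C']

def set_globals():
    global globalvar, globalarray2
    globalvar = 1                    # rebinds the name in shared_demo
    globalarray1.append(4)           # mutates the existing list in place
    globalarray2 = ['a', 'b', 'c']   # rebinds the name to a new list
'''
shared_demo = types.ModuleType("shared_demo")
exec(SHARED_SOURCE, shared_demo.__dict__)
sys.modules["shared_demo"] = shared_demo

# Each imported name becomes a separate name in this file, bound to the
# object that the module's name referred to at import time.
from shared_demo import globalvar, globalarray1, globalarray2, set_globals

set_globals()

print(globalvar, shared_demo.globalvar)                        # 0 1
print(globalarray1, globalarray1 is shared_demo.globalarray1)  # [1, 2, 3, 4] True
print(globalarray2, shared_demo.globalarray2)                  # ['A', 'B', 'C'] ['a', 'b', 'c']
```

The `is` check shows that the appended list is one object reachable through two names, while the two rebound names now point at different objects in the two namespaces.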

    How to fix this and see consistent behavior:

    You should access variables from the shared module using an explicit module prefix. This way, each access looks the name up in the shared module at that moment, so you always get the most recent value. Here's a revised version of module.py, saved as module2.py:

    import paas.shared as shared
    
    def _main_():
        shared.set_globals()
    
        print("main: globalvar="+str(shared.globalvar))
        print("main: globalarray1="+str(shared.globalarray1))
        print("main: globalarray2="+str(shared.globalarray2))
    
    _main_()
    

    With this change, the print statements in _main_() will refer directly to the shared module's versions of the variables, and you'll see consistent behavior between set_globals and _main_().

    Now let's compare the output. Here's module.py (unmodified from above):

    $ python module.py
    
       set_globals: globalvar=1
       set_globals: globalarray1=[1, 2, 3, 4]
       set_globals: globalarray2=['a', 'b', 'c']
       main: globalvar=0
       main: globalarray1=[1, 2, 3, 4]
       main: globalarray2=['A', 'B', 'C']
    

    And now the new code using an explicit import (module2.py):

    $ python module2.py
    
       set_globals: globalvar=1
       set_globals: globalarray1=[1, 2, 3, 4]
       set_globals: globalarray2=['a', 'b', 'c']
       main: globalvar=1
       main: globalarray1=[1, 2, 3, 4]
       main: globalarray2=['a', 'b', 'c']
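
If you would rather keep a from-style import, one possible alternative (not part of the answer above, and only a sketch) is to hold the shared values in a single mutable object and only ever mutate it, never rebind it. The names state and imported_state below are hypothetical, and the "import" is simulated with a plain assignment so the example runs as one file.

```python
# Stand-in for the shared module's contents; `state` is a hypothetical name.
state = {
    "globalvar": 0,
    "globalarray2": ['A', 'B', 'C'],
}

def set_globals():
    # Item assignment mutates the one shared dict; the name `state`
    # itself is never rebound, so every holder of the dict sees the change.
    state["globalvar"] = 1
    state["globalarray2"] = ['a', 'b', 'c']

# What `from shared import state` would do in another module:
# bind a second name to the same dict object.
imported_state = state

set_globals()

print(imported_state["globalvar"])
print(imported_state["globalarray2"])
```

This works for the same reason globalarray1.append(4) was visible in module.py: the container is mutated in place, so there is no rebinding for the importing module to miss.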