
Referencing Environment with Global Python Variables


In the following program, I want to check the number of references to the global variable a at each of the four print calls (points 1, 2, 3, and 4).

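import sys

a = 'my-string'
print(sys.getrefcount(a))  # point 1: a is bound, no containers yet

b = [a]
print(sys.getrefcount(a))  # point 2: the list b also holds a

del b
print(sys.getrefcount(a))  # point 3: the list is gone again

c = { 'key': a }
print(sys.getrefcount(a))  # point 4: the dict c holds a as a value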

When I executed the program, the reference counts were 4, 5, 4, and 5, respectively. I do not understand why this is so.

My first thought was that all the global variables are allocated from the static segment of the stack, so, for each of a, b = [a], and c = { 'key': a }, there is a reference, making up 3 references. And for each print(sys.getrefcount(a)), there is a reference, making up 4 references in total.

If that is the case, where does the 5th reference come from in the statements after b = [a] and c = { 'key': a }, especially after b is deleted?


Solution

  • I believe the references are:

    • An unknown internal reference. It is not the string cache, since this also happens for other immutable literals such as floats or ints (outside the -5 to 256 range). Maybe an artifact of parsing?
    • The constant table of the code object for the module
    • The entry in the locals of the frame for the code object (in this case, the globals() dictionary)
    • The reference that was created when passing the string object to getrefcount
    • And the one that changes: the reference held by the collection (the list b or the dict c).
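
The constant-table and globals entries in the list above can be checked directly. The following is a minimal sketch, assuming CPython; the file name "<demo>" and the variable names are made up for illustration. It compiles a one-line module, looks for the literal in the code object's co_consts, and then confirms that the name a in the executed namespace is bound to that same object.

```python
# Compile a one-line module without running it. "<demo>" is a placeholder
# file name used only for this illustration.
source = "a = 'my-string'"
code = compile(source, "<demo>", "exec")

# The literal is stored in the code object's constant table.
in_consts = "my-string" in code.co_consts

# Executing the code binds the name in a namespace dict, which plays the
# role of globals() for this small module.
namespace = {}
exec(code, namespace)

# On CPython, the object bound to `a` is the very object held in co_consts,
# so the constant table and the namespace each hold one reference to it.
same_object = any(const is namespace["a"] for const in code.co_consts)

print(in_consts, same_object)
```

The identity check relies on CPython loading constants straight from co_consts; other implementations may behave differently.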

    You can observe a few interesting things about the way CPython works:

    • Since identical literals are the exact same object throughout the code, the numbers won't change if you refer to "my-string" directly instead of a.
    • If you use literals like floats, large or negative ints, non-empty tuples, and similar objects, the behavior will be exactly the same.
    • If you use small ints, empty tuples, boolean literals, or similar magic objects, you will see different things depending on your version of CPython: either a number that is quite a bit higher than for everything else, or a very large number, meaning that the highest bit has been flipped to indicate an immortal object.
      • This also includes interned strings you might not expect, like __init__.
    • If you use mutable objects like a list literal [], or object(), set(), you will instead see the pattern 2, 3, 2, 3, indicating that the first two entries in the list above are no longer present.
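
Because the absolute numbers depend on the CPython version and on the kind of object, it can be easier to look at how the count changes relative to a baseline. The sketch below is an illustration only; refcount_deltas is a hypothetical helper, not part of any library. It repeats the question's four steps for an arbitrary object and reports the differences.

```python
import sys


def refcount_deltas(obj):
    """Return the change in refcount, relative to a baseline, after each step.

    Hypothetical helper for illustration; it mirrors the steps in the question.
    """
    baseline = sys.getrefcount(obj)

    holder = [obj]                      # the list holds one extra reference
    with_list = sys.getrefcount(obj) - baseline

    del holder                          # the extra reference goes away
    after_del = sys.getrefcount(obj) - baseline

    mapping = {"key": obj}              # the dict value holds one extra reference
    with_dict = sys.getrefcount(obj) - baseline

    return with_list, after_del, with_dict


# A fresh object() is an ordinary, non-immortal object.
deltas = refcount_deltas(object())
print(deltas)
```

For an ordinary object such as object(), the deltas should be (1, 0, 1), matching the +1/-1 pattern in the question. For immortal objects on newer CPython versions the count may not move at all, so the deltas can differ.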