python, scope, global, lookup, built-in

Does the name *open* belong to the built-in or the global scope in this example?


Consider this code snippet:

global open
print(open)

which gives the following result:

<built-in function open>

My question is: Does the name open belong to the built-in or the global scope in this example?

I thought that the global declaration would force the name open to be mapped to the global scope (and thus lead to an error), but that is not happening here. Why?


Solution

  • First, the direct answer:

    The name open belongs to the top-level namespace, which essentially means "look up in globals, fall back to builtins; assign to globals".

    Adding global open just forces it to belong to the top-level namespace, where it already was. (I'm assuming this is top-level code, not inside a function or class.)
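A minimal sketch of this, assuming CPython 3 and using exec with a fresh dictionary to stand in for a module's global namespace (module_ns, found_builtin and open_in_globals are illustrative names, not part of the question):

```python
import builtins

# Run the question's snippet as top-level code in a fresh namespace.
# `module_ns` is a hypothetical stand-in for a module's global dictionary.
module_ns = {}
exec("global open\nresult = open", module_ns)

# The lookup succeeded through the builtins fallback...
found_builtin = module_ns["result"] is builtins.open
# ...and the global declaration did not create a global called `open`.
open_in_globals = "open" in module_ns

print(found_builtin, open_in_globals)  # True False
```

So the declaration changes nothing here: the name was already being resolved through the top-level namespace, and nothing has been assigned to it in globals.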

    Does that seem to contradict what you read? Well, it's a bit more complicated.


    According to the reference docs:

    The global statement is a declaration which holds for the entire current code block. It means that the listed identifiers are to be interpreted as globals.


    But, despite what other parts of the docs seem to imply, "interpreted as globals" doesn't actually mean "searched in the global namespace", but "searched in the top-level namespace", as documented in Resolution of names:

    Names are resolved in the top-level namespace by searching the global namespace, i.e. the namespace of the module containing the code block, and the builtins namespace, the namespace of the module builtins. The global namespace is searched first. If the name is not found there, the builtins namespace is searched.

    And "as globals" means "the same way that names in the global namespace are looked up", aka "in the top-level namespace".

    And, of course, assignment to the top-level namespace always goes to globals, not builtins. (That's why you can shadow the builtin open with the global open in the first place.)
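The lookup order and the assignment rule can be seen together in a short sketch. It again assumes CPython 3 and uses an illustrative module_ns dictionary as the global namespace; the variable names inside the source string are made up for the example:

```python
import builtins

source = """
before = open        # not in globals yet: found via the builtins fallback
open = "shadowed"    # assignment goes to the global namespace
after = open         # globals are searched first, so the shadow wins
del open             # remove the global binding again
restored = open      # the builtins fallback is visible once more
"""

module_ns = {}  # hypothetical stand-in for a module's globals
exec(source, module_ns)

before_is_builtin = module_ns["before"] is builtins.open
after_value = module_ns["after"]
restored_is_builtin = module_ns["restored"] is builtins.open
# The builtins module itself was never modified by the assignment.
builtin_untouched = callable(builtins.open)
```

The assignment never touches the builtins module; it only adds (and later removes) an entry in the global dictionary that hides the builtin.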


    Also, notice that, as explained in the exec and eval docs, even this isn't quite true for code run through exec:

    If the globals dictionary does not contain a value for the key __builtins__, a reference to the dictionary of the built-in module builtins is inserted under that key. That way you can control what builtins are available to the executed code by inserting your own __builtins__ dictionary into globals before passing it to exec().

    And exec is, ultimately, how modules and scripts get executed.

    So what really happens, at least by default, is this: the global namespace is searched for the name; if the name is not found, the global namespace is searched for a __builtins__ value; and if that value is a module or a mapping, the name is looked up there.
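The __builtins__ indirection can be observed directly. The following is a sketch of CPython behaviour, not a sandboxing technique; default_ns and restricted_ns are illustrative names:

```python
# 1. exec inserts a __builtins__ entry when the globals dict lacks one.
default_ns = {}
exec("pass", default_ns)
builtins_inserted = "__builtins__" in default_ns

# 2. Supplying our own __builtins__ mapping controls the fallback lookup.
#    Only `len` is made available here.
restricted_ns = {"__builtins__": {"len": len}}
exec("n = len('abc')", restricted_ns)
n = restricted_ns["n"]

# 3. A name missing from both globals and that mapping cannot be resolved.
try:
    exec("f = open", restricted_ns)
    open_missing = False
except NameError:
    open_missing = True
```

In the third step open is absent from both the globals dictionary and the custom __builtins__ mapping, so the lookup has nowhere left to fall back to.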


    If you're curious how this works in CPython in particular:

    • At compile time:
      • The compiler builds a symbol table for a function, separating names out into freevars (nonlocals), cellvars (locals that are used as nonlocals by nested functions), locals (any other locals) and globals (which of course technically means "top-level namespace" variables). This is where the global statement comes into play: it forces the name to be added to the global symbol table instead of a different one.
      • Then it compiles the code, and emits LOAD_GLOBAL instructions for the globals. (And it stores the various names in tuple members on the code object, like co_names for globals (along with attribute names) and co_cellvars for cellvars and so on.)
    • At runtime:
      • When a function object gets created from compiled code, it gets __globals__ attached to it as an attribute.
      • When a function gets called, its __globals__ becomes the f_globals for the frame.
      • The interpreter's eval loop then handles each LOAD_GLOBAL instruction by doing exactly what you'd expect with that f_globals, including the fallback to __builtins__ as described in the exec docs.
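The compile-time and runtime pieces described above can be inspected with the standard dis module. This is a rough sketch assuming a recent CPython 3; exact bytecode varies between versions, and the function names and open_count are made up for the example:

```python
import dis

def read_open():
    # `open` is never assigned here, so the compiler treats it as a global.
    return open

def assign_local():
    open_count = 1      # no declaration: a plain local variable

def assign_global():
    global open_count   # hypothetical name, declared global
    open_count = 1      # the store now targets the global namespace

def opnames(func):
    """Return the list of bytecode instruction names for a function."""
    return [ins.opname for ins in dis.get_instructions(func)]

read_names = read_open.__code__.co_names          # names used as globals
uses_load_global = "LOAD_GLOBAL" in opnames(read_open)
local_uses_store_global = "STORE_GLOBAL" in opnames(assign_local)
global_uses_store_global = "STORE_GLOBAL" in opnames(assign_global)

# The function carries a reference to the namespace it was defined in.
same_globals = read_open.__globals__ is globals()
```

The only difference between assign_local and assign_global is the global statement, and it shows up as a different store instruction in the compiled code.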