Tags: python, language-lawyer, python-import, python-3.7, circular-dependency

Why does Python 2 try to get a module as a package attribute when using "import ... as ..."?


EDIT: This is closely related to "Imports in __init__.py and import as statement". That question deals with the behaviour of import ... as ... up to Python 3.6. The change in behaviour I'm describing below was introduced in Python 3.7, with the intention of fixing the bug described in that other question. I'm more interested in where the change is documented (or where the two behaviours, Py2 up to Py3.6 versus Py3.7+, are respectively documented) than in how exactly this behaviour arises (I already mostly understand that from experimenting in preparation for this question).


Consider the following directory structure:

.
└── package
    ├── __init__.py
    ├── a.py
    └── b.py

The __init__.py file is empty. The two modules package.a and package.b contain, respectively:

# package.a
import sys

print('package.b' in sys.modules)
import package.b as b

spam = 'ham'
print("{} says b is {}".format(__name__, b))

# package.b
import package.a

print("{} says package.a.spam is {}".format(__name__, repr(package.a.spam)))


With Python 3.x (specifically 3.8), when I run python -c "from __future__ import print_function; import package.b" from the root directory, I get

True
package.a says b is <module 'package.b' from 'C:\\[...]\\package\\b.py'>
package.b says package.a.spam is 'ham'

but with Python 2.x (specifically 2.7) I get

True
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "package\b.py", line 1, in <module>
    import package.a
  File "package\a.py", line 4, in <module>
    import package.b as b
AttributeError: 'module' object has no attribute 'b'

The question is: what warrants this difference? Where is this change documented, e.g. the Python docs, a PEP or similar?



I understand that package.b hasn't finished initialising when package.a is imported, so the module object for package.b hasn't yet been set as an attribute of the module object for package. Yet the module object itself exists (it has already been added to sys.modules), so there shouldn't be any trouble binding the name b to that object, which I believe is what Python 3 does. Python 2, by contrast, doesn't seem to bind the name directly to the module object; it appears to fetch it by getting an attribute named 'b' from the module object for package.
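The intermediate state described above can be observed directly. The following is a minimal sketch for Python 3 only, not part of the original setup: the package name demo_pkg and the helper probe_import_state are hypothetical, and the package is created in a temporary directory so nothing on disk needs to exist beforehand. While the submodule's body runs, it records whether it is already in sys.modules and whether it is already an attribute of its parent package.

```python
import importlib
import os
import sys
import tempfile
import textwrap


def probe_import_state():
    """Import a throwaway submodule and report what it saw while executing.

    Returns (in_sys_modules_during_import,
             is_parent_attribute_during_import,
             is_parent_attribute_after_import).
    """
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "demo_pkg")  # hypothetical package name
        os.mkdir(pkg_dir)
        # Empty __init__.py, as in the question.
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
            f.write(textwrap.dedent("""
                import sys
                # Evaluated while demo_pkg.sub is still being initialised.
                IN_SYS_MODULES = "demo_pkg.sub" in sys.modules
                IS_PARENT_ATTR = hasattr(sys.modules["demo_pkg"], "sub")
            """))
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            sub = importlib.import_module("demo_pkg.sub")
            after = hasattr(sys.modules["demo_pkg"], "sub")
            return sub.IN_SYS_MODULES, sub.IS_PARENT_ATTR, after
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(root)
            sys.modules.pop("demo_pkg.sub", None)
            sys.modules.pop("demo_pkg", None)


print(probe_import_state())  # (True, False, True)
```

The middle value is the relevant one: the module object is registered in sys.modules before its code runs, but it only becomes an attribute of the parent package once its import has completed.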

As far as I can see, there is no such specification in the documentation.

Import statement (Python 3.8):

If the requested module is retrieved successfully, it will be made available in the local namespace in one of three ways:

  • If the module name is followed by as, then the name following as is bound directly to the imported module.

[...]

Import statement (Python 2.7):

The first form of import statement binds the module name in the local namespace to the module object, and then goes on to import the next identifier, if any. If the module name is followed by as, the name following as is used as the local name for the module.



Notes:

  • Using from package import b in package/a.py yields the same, only with a different error (i.e. ImportError instead of AttributeError). I suspect the ImportError is just wrapping the underlying AttributeError.
  • Using import package.b in package/a.py doesn't give the AttributeError upon import in Py2. But, of course, referencing package.b later in the print call produces an AttributeError in both Py2 and Py3.

Solution

  • If you do

    import package.submodule
    

    and then try to access package.submodule, that access is an attribute lookup on the module object for package, and it will find whatever object is bound to the submodule attribute (or fail if that attribute is unset). For consistency,

    import package.submodule as whatever
    

    performs the same attribute lookup to find the object to bind to whatever.

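    This was changed in Python 3.7 (bpo-30024) to fall back to a sys.modules['package.submodule'] lookup if the attribute lookup fails. The change is listed in the "What's New in Python 3.7" notes as support for circular imports involving absolute imports that bind a submodule to a name. It was made for consistency with an earlier change in Python 3.5 (bpo-17636) that made from package import submodule fall back to a sys.modules lookup; that earlier change was made to make relative imports more convenient.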
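The difference between the two behaviours can be modelled in a few lines. This is a simplified sketch, not CPython's actual implementation: the function bind_import_as, its fallback flag, and the plain dict standing in for sys.modules are all assumptions made for illustration. With fallback=False it mimics the plain attribute lookup of Python 2 through 3.6; with fallback=True it mimics the Python 3.7+ behaviour of consulting the module registry when the attribute is missing.

```python
import types


def bind_import_as(dotted_name, modules, fallback=True):
    """Model which object `import a.b.c as x` would bind to x.

    `modules` is a dict standing in for sys.modules (an assumption of
    this sketch). The real import machinery is not involved.
    """
    top, *rest = dotted_name.split(".")
    obj = modules[top]
    for part in rest:
        try:
            # Attribute lookup on the parent module object.
            obj = getattr(obj, part)
        except AttributeError:
            if not fallback:
                raise  # pre-3.7 style: the missing attribute is fatal
            # 3.7+ style: retry via the registry, using the full dotted name.
            full_name = "{}.{}".format(obj.__name__, part)
            try:
                obj = modules[full_name]
            except KeyError:
                raise ImportError("cannot import name {!r}".format(part))
    return obj


def demo():
    """Reproduce the state in the question: b is registered but not yet
    set as an attribute of its parent package."""
    package = types.ModuleType("package")
    b = types.ModuleType("package.b")
    modules = {"package": package, "package.b": b}
    try:
        bind_import_as("package.b", modules, fallback=False)
        old_result = "bound"
    except AttributeError:
        old_result = "AttributeError"
    new_result = bind_import_as("package.b", modules, fallback=True)
    return old_result, new_result is b


print(demo())  # ('AttributeError', True)
```

In this model the half-initialised package.b is found only when the fallback is enabled, which matches the AttributeError seen under Python 2.7 and the successful import seen under Python 3.8 in the question.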