python-3.x python-import python-importlib

How can I make Python's "importlib.util.module_from_spec" work more like "import"?


I am continuing work on Ned Batchelder's byterun, a Python interpreter written in Python, extending it to Python versions other than Python 3.4. See x-python.

One of the long-standing concerns of this kind of approach is separating the interpreter namespace in imports from the interpreted program namespace.

Aside: Not separating the namespaces can be advantageous if you want a fast interpreter that doesn't interpret into the imported modules. Separating them is slower but more correct, and it is necessary when interpreting bytecode from a different Python version.

So when the interpreter encounters an IMPORT_NAME opcode, I would like to use importlib.util to get a copy of the module that is distinct from any module the interpreter itself has imported.

The problem I have right now is that the two approaches import differently, which can be seen using hasattr().

Here is an example:

import importlib.util  # "import importlib" alone does not guarantee importlib.util is loaded

module_spec = importlib.util.find_spec("textwrap")
textwrap_module = importlib.util.module_from_spec(module_spec)
submodule = "fill"
print(hasattr(textwrap_module, submodule)) # False

import textwrap
print(hasattr(textwrap, submodule)) # True

How do I get the same behavior using importlib.util?

(I should note, however, that for sys, both approaches find "path" as an attribute of sys.)


Solution

  • Is the module ever executed? If not, the assignments and definitions within the module have not yet put the child objects into the __dict__ of the module object. Judging from the standard file loader's version of this function, when you call module_from_spec all you get is the module object, with the members that logically belong to all modules. It has no contents.

    (The system may load modules from different paths, or in different states of use, with other Loader objects, which may complicate this task overall. For sys, for instance, the module you get seems to be populated already, which is why sys.path is found. Bear in mind that the loader is a separate object on the spec for a reason: there is not just one Python loader.)
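To see which loader the import system would pick for a given module, the spec can be inspected directly. This is only a small sketch; the helper name describe_loader is made up for illustration. On CPython it typically reports a built-in importer for sys and a source file loader for textwrap, but the exact class names depend on the installation.

```python
import importlib.util


def describe_loader(name):
    """Return (loader type name, spec origin) for a top-level module name."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(name)
    loader = spec.loader
    # The loader may be a class (as for built-in modules) or an instance
    # (as for source files), so normalise to a type before taking its name.
    loader_type = loader if isinstance(loader, type) else type(loader)
    return loader_type.__name__, spec.origin


for name in ("sys", "textwrap"):
    print(name, describe_loader(name))
```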

    If it is not already loaded, to get the module object populated, you would call

    module_spec.loader.exec_module(module)
    

    (If it exists.)

    For example on Python 3.8:

    import importlib.util
    module_spec = importlib.util.find_spec('textwrap')
    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)
    print(module.fill)
    

    will output:

    <function fill at 0x7fae70e6fd90>
    

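    exec_module was added in Python 3.4. With older loaders that do not provide it, you need to fall back on the deprecated load_module, which returns the loaded module and also registers it in sys.modules:

    module = module_spec.loader.load_module(module.__name__)
    

    On 3.6 and later, however, the exec_module example above works as expected.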
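The steps above can be wrapped into one function. This is a minimal sketch rather than a complete solution: the name load_private_copy is hypothetical, and it assumes that refusing legacy loaders is acceptable, since load_module would register the module in sys.modules and defeat the purpose of a separate copy.

```python
import importlib.util


def load_private_copy(name):
    """Load a fresh copy of module `name` without registering it in sys.modules."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find a loader for {name!r}")
    if not hasattr(spec.loader, "exec_module"):
        # Legacy loaders only offer load_module(), which puts the module
        # into sys.modules, so it cannot give us a private copy.
        raise ImportError(f"loader for {name!r} does not support exec_module")
    module = importlib.util.module_from_spec(spec)
    # Run the module body; this is what fills module.__dict__.
    spec.loader.exec_module(module)
    return module


private_textwrap = load_private_copy("textwrap")
print(hasattr(private_textwrap, "fill"))
```

Because the module body is executed again, the functions in the private copy are new objects, not the ones held by a regular import of textwrap.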

    The information was gleaned by looking at the _exec function in importlib/_bootstrap.py, and the things that it calls.

    There is still a question here as to whether imports made inside the loaded module will end up in the global namespace (sys.modules), or in your own. But that is another issue.
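A quick way to probe that last question is to compare the objects involved. The sketch below relies on the fact that CPython's textwrap.py does a top-level "import re"; the variable names are illustrative only. It suggests that the copy itself stays out of sys.modules, while the imports it performs still go through the normal, shared import machinery.

```python
import importlib.util
import sys
import textwrap  # the interpreter's own copy, registered in sys.modules

spec = importlib.util.find_spec("textwrap")
private = importlib.util.module_from_spec(spec)
spec.loader.exec_module(private)

# The privately executed copy is a different object from the registered one.
copy_is_shared = private is sys.modules["textwrap"]

# The "import re" inside textwrap.py was resolved through sys.modules,
# so the private copy shares its dependency with the rest of the process.
dependency_is_shared = private.re is sys.modules["re"]

print(copy_is_shared, dependency_is_shared)
```

So a fully separate namespace would also need the nested imports to be intercepted, for example by the interpreter handling IMPORT_NAME itself while it interprets the module's bytecode.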