Tags: python, linux, python-import, cpython

Where/how does the name `posix` get resolved by an import statement?


What happens behind the scenes (in CPython 3.6.0) when code uses import posix? This module doesn't have a __file__ attribute. When starting the interpreter in verbose mode, I see this line:

import 'posix' # <class '_frozen_importlib.BuiltinImporter'>

It's already present in sys.modules in a newly opened interpreter, so importing it just binds a name to the existing module.

I'm trying to look at the implementation details of os.lstat on my platform to determine if and when it uses os.stat.


Solution

  • Here, have more detail than you're likely to need.


    posix is a built-in module. When you hear "built-in module", you might think of ordinary standard library modules, or you might think of modules written in C, but posix is more built-in than most.

    The posix module is written in C, in Modules/posixmodule.c. However, while most C modules, even standard library C modules, are compiled to .so or .pyd files and placed on the import path like regular Python modules, posix actually gets compiled right into the Python executable itself.
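    You can see the difference from an ordinary module with a quick check. This is only a sketch: the fallback to nt is my own assumption so that it also runs on Windows builds, and json is just an arbitrary file-backed module picked for contrast.

```python
import importlib
import sys

# Assumption: fall back to "nt" so the sketch also runs where posix is absent.
NAME = "posix" if "posix" in sys.builtin_module_names else "nt"

builtin_mod = importlib.import_module(NAME)
import json  # an ordinary module loaded from a file, for contrast

# A module compiled into the executable has no file to point at.
has_file_builtin = hasattr(builtin_mod, "__file__")
has_file_json = hasattr(json, "__file__")

print(NAME, "has __file__:", has_file_builtin)
print("json has __file__:", has_file_json)
print(repr(builtin_mod))  # the repr mentions "built-in" instead of a path
```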


    One of the internal details of CPython's import system is the PyImport_Inittab array:

    extern struct _inittab _PyImport_Inittab[];
    
    struct _inittab *PyImport_Inittab = _PyImport_Inittab;
    

    This is an array of struct _inittab entries, each of which consists of a name and the C module initialization function for the module with that name. Modules listed here are built-in.

    This array is initially set to _PyImport_Inittab, which comes from Modules/config.c (or PC/config.c depending on your OS, but that's not the case here). Unfortunately, Modules/config.c is generated from Modules/config.c.in during the Python build process, so I can't show you a source code link, but here's part of what it looks like when I generate the file:

    struct _inittab _PyImport_Inittab[] = {
    
            {"_thread", PyInit__thread},
            {"posix", PyInit_posix},
            // ...
    

    As you can see, there's an entry for the posix module, along with the module initialization function, PyInit_posix.
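    You can't read that C array directly from Python, but sys.builtin_module_names exposes the names of the compiled-in modules, and as far as I know CPython fills it from this table at startup. A small sketch that looks for a few names (the particular names checked are just my picks for illustration):

```python
import sys

names = sys.builtin_module_names  # a tuple of module name strings

print(type(names).__name__, len(names))

# Which of these well-known names are compiled into this interpreter?
candidates = ("posix", "nt", "_thread", "sys", "builtins")
present = [n for n in candidates if n in names]
print(present)
```

    On a Linux build I'd expect posix to show up in that list and nt not to, but the exact contents depend on how the interpreter was built.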


    As part of the import system, when trying to load a module, Python goes through sys.meta_path, a list of module finders. One of these finders is responsible for performing the sys.path search you're likely more familiar with, but one of the others is _frozen_importlib.BuiltinImporter, responsible for finding built-in modules like posix. When Python tries that finder, it runs the finder's find_spec method:

    @classmethod
    def find_spec(cls, fullname, path=None, target=None):
        if path is not None:
            return None
        if _imp.is_builtin(fullname):
            return spec_from_loader(fullname, cls, origin='built-in')
        else:
            return None
    

    which uses _imp.is_builtin to search PyImport_Inittab for the "posix" name. The search finds the name, so find_spec returns a module spec representing the fact that the loader for built-in modules should handle creating this module. (The loader is the second argument to spec_from_loader. It's cls here, because BuiltinImporter is both the finder and loader.)
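    You can replay the finder step by hand. The sketch below walks sys.meta_path roughly the way the import system does and reports the first finder that returns a spec. first_spec is a hypothetical helper of mine, not part of importlib, and the real machinery does more than this (sys.modules lookup, locking, parent packages).

```python
import sys
from importlib.machinery import BuiltinImporter

# Assumption: fall back to "nt" so the sketch also runs where posix is absent.
NAME = "posix" if "posix" in sys.builtin_module_names else "nt"


def first_spec(fullname):
    """Return (finder, spec) for the first meta path finder that knows fullname."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:
            continue  # very old-style finder without find_spec
        spec = find_spec(fullname, None)
        if spec is not None:
            return finder, spec
    return None, None


finder, spec = first_spec(NAME)
print(finder)
print(spec)

# Asking the built-in finder directly gives the same kind of answer.
direct = BuiltinImporter.find_spec(NAME)
print(direct.origin, direct.loader is BuiltinImporter)

# A module that lives in a file is not its business.
print(BuiltinImporter.find_spec("json"))
```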

    Python then runs the loader's create_module method to generate the module object:

    @classmethod
    def create_module(self, spec):
        """Create a built-in module"""
        if spec.name not in sys.builtin_module_names:
            raise ImportError('{!r} is not a built-in module'.format(spec.name),
                              name=spec.name)
        return _call_with_frames_removed(_imp.create_builtin, spec)
    

    which delegates to _imp.create_builtin, which searches PyImport_Inittab for the module name and runs the corresponding initialization function.
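    The result of that step is visible on the loaded module. A minimal sketch, again assuming a fallback to nt where posix is not available:

```python
import importlib
import sys
from importlib.machinery import BuiltinImporter

# Assumption: fall back to "nt" so the sketch also runs where posix is absent.
NAME = "posix" if "posix" in sys.builtin_module_names else "nt"

mod = importlib.import_module(NAME)

# Importing again just hands back the object cached in sys.modules.
print(mod is sys.modules[NAME])

# The spec records who found and loaded the module.
print(mod.__spec__.origin)
print(mod.__spec__.loader is BuiltinImporter)

# Functions defined by the C initialization function are built-in functions.
print(type(mod.stat).__name__)
```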

    (_call_with_frames_removed(x, y) just calls x(y), but part of the import system treats it as a magic indicator to strip importlib frames from stack traces, which is why you never see those frames in the stack trace when your imports go wrong.)
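    To watch the frame stripping happen, you can import a module whose body raises and look at the traceback. This is a sketch under my own assumptions: frames_demo_boom is a made-up module name written to a temporary directory, and the interpreter is not running in verbose mode (where, as I understand it, the frames are left in).

```python
import os
import sys
import tempfile
import traceback

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module that fails while its body executes.
    with open(os.path.join(tmp, "frames_demo_boom.py"), "w") as fh:
        fh.write("1 / 0\n")
    sys.path.insert(0, tmp)
    try:
        import frames_demo_boom
    except ZeroDivisionError as exc:
        frames = traceback.extract_tb(exc.__traceback__)
    finally:
        sys.path.remove(tmp)

filenames = [os.path.basename(f.filename) for f in frames]
print(filenames)

# The importlib machinery ran between the import statement and the module
# body, but its frames are not in the traceback.
importlib_frames = [f for f in frames if "importlib" in f.filename]
print(importlib_frames)
```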


    If you want to see more of the code path involved, there are three places to look: Lib/importlib/_bootstrap.py, where most of the import implementation lives; Python/import.c, where most of the C part of the implementation lives; and Python/ceval.c, where the bytecode interpreter loop lives, and so where execution of an import statement starts before it reaches the core of the import machinery.
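    If you want a starting point in the interpreter loop, you can disassemble an import statement and look up the opcodes it compiles to. A small sketch (compiling the statement does not execute it, so this works even where posix is not importable):

```python
import dis

# Compile, but do not run, a module-level import statement.
code = compile("import posix", "<example>", "exec")

opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# IMPORT_NAME is the instruction that hands control to the import machinery;
# STORE_NAME then binds the resulting module to the name "posix".
print("IMPORT_NAME" in opnames, "STORE_NAME" in opnames)
```

    The exact instruction list varies between CPython versions, so I'd treat the surrounding opcodes as an implementation detail.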

    Relevant documentation includes the section of the language reference on the import system, as well as PEPs 451 and 302. There isn't much documentation on built-in modules, although I did find a bit of documentation targeted toward people embedding Python in other programs, since they might want to modify PyImport_Inittab, and there is the sys.builtin_module_names list.