python, python-import, python-extensions

Is the import order of extensions in module filenames guaranteed in Python?


Experimentally, I verified that when a compiled extension.pyd (or .so) and a plain extension.py both exist in the same directory, the .pyd file is imported first; the .py file is imported only if the .pyd file is not found:

In [1]: import extension

In [2]: extension.__file__
Out[2]: 'extension.pyd'

In [3]: import glob; glob.glob("extension.py*")
Out[3]: ['extension.py', 'extension.pyd']

Is that guaranteed to be the same for all versions of Python, and can I rely on this to add logic to the .py file that is only executed when the .pyd file is not found?


Solution

  • FWIW, I was not able to find a reference stating that extensions must be loaded before py-files, so it is probably safer to treat this as an implementation detail (unless somebody provides a reference), even though this detail has been stable for all versions at least back to 2.7.

    When a module is imported, it is first looked up in the cache (i.e. sys.modules); if it is not there yet, the finders from sys.meta_path are used. Usually, sys.meta_path consists of BuiltinImporter, FrozenImporter and PathFinder, where PathFinder is responsible for finding modules on disk, i.e. on the Python path.
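    To see which finders are active in a given interpreter, the list can simply be printed. This is a minimal sketch; the helper name meta_path_names is made up for illustration, and the exact contents vary with the Python version and with tools that install their own finders.

```python
import sys


def meta_path_names():
    """Return a readable name for every finder currently on sys.meta_path."""
    # The default finders are classes (they have __name__); third-party
    # finders are often instances, so fall back to the type name.
    return [getattr(finder, "__name__", type(finder).__name__)
            for finder in sys.meta_path]


print(meta_path_names())
```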

    PathFinder provides some caching (sys.path_importer_cache) to speed up the look-up, but it basically delegates the search to the hooks from sys.path_hooks; an overview can be found, for example, in PEP 302.

    Usually, sys.path_hooks consists of zipimporter, which makes importing from zip archives possible, and a wrapped FileFinder, which is the workhorse of the whole import machinery.
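    The delegation can be imitated by hand: each hook is called with a path entry and either returns a finder or raises ImportError. The following is only a rough sketch of that idea, not the actual PathFinder code; finder_for is a hypothetical helper name.

```python
import sys
import tempfile


def finder_for(directory):
    """Return the finder produced by the first hook in sys.path_hooks that accepts `directory`."""
    for hook in sys.path_hooks:
        try:
            return hook(directory)
        except ImportError:
            # This hook does not handle the path
            # (e.g. zipimporter when given a plain directory).
            continue
    return None


with tempfile.TemporaryDirectory() as tmp:
    # On a default CPython setup this is expected to be a FileFinder.
    print(type(finder_for(tmp)).__name__)
```

    The standard library offers pkgutil.get_importer(path) for roughly the same purpose.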

    FileFinder tries out different suffixes (i.e. .so, .py, .pyc) in a given order, which is established by the _get_supported_file_loaders() function in importlib._bootstrap_external:

    def _get_supported_file_loaders():
        """Returns a list of file-based module loaders.
        Each item is a tuple (loader, suffixes).
        """
        extensions = ExtensionFileLoader, _imp.extension_suffixes()
        source = SourceFileLoader, SOURCE_SUFFIXES
        bytecode = SourcelessFileLoader, BYTECODE_SUFFIXES
        return [extensions, source, bytecode]
    

    As one can see:

    • extensions come before source files (i.e. py-files)
    • source files come before pyc-files
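    The effect of this ordering can be checked without compiling anything, because FileFinder.find_spec() only looks at file names and does not load the file. The sketch below assumes a CPython build with at least one extension suffix; it rebuilds the loader list from the public importlib.machinery constants rather than calling the private function, and winning_file is a hypothetical helper name.

```python
import tempfile
from pathlib import Path
from importlib.machinery import (
    BYTECODE_SUFFIXES, EXTENSION_SUFFIXES, SOURCE_SUFFIXES,
    ExtensionFileLoader, FileFinder, SourceFileLoader, SourcelessFileLoader,
)


def winning_file(name="extension"):
    """Return (file name, loader name) that FileFinder picks when all variants exist."""
    with tempfile.TemporaryDirectory() as tmp:
        # Empty placeholder files: they are only looked up, never loaded.
        for suffix in (EXTENSION_SUFFIXES[0], ".py", ".pyc"):
            (Path(tmp) / (name + suffix)).touch()
        # Same order as in the function quoted above:
        # extensions, then source, then bytecode.
        finder = FileFinder(
            tmp,
            (ExtensionFileLoader, EXTENSION_SUFFIXES),
            (SourceFileLoader, SOURCE_SUFFIXES),
            (SourcelessFileLoader, BYTECODE_SUFFIXES),
        )
        spec = finder.find_spec(name)
        return Path(spec.origin).name, type(spec.loader).__name__


print(winning_file())
```

    With this loader order, the placeholder carrying the extension suffix should be reported together with ExtensionFileLoader, even though extension.py sits right next to it.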

    Obviously, sys.meta_path as well as sys.path_hooks can be manipulated in a way that establishes an arbitrary order of load preferences.
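    As an illustration of such a manipulation, one could temporarily put a path hook in front whose FileFinder tries source suffixes before extension suffixes. This is a sketch under assumptions, not a recommendation: import_preferring_source, demo and the module name order_demo_mod are hypothetical, and the "extension" is just an empty placeholder file that would fail to load if it were ever picked.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from importlib.machinery import (
    EXTENSION_SUFFIXES, SOURCE_SUFFIXES,
    ExtensionFileLoader, FileFinder, SourceFileLoader,
)


def import_preferring_source(directory, name):
    """Import `name` from `directory` via a hook that tries .py before extension suffixes."""
    hook = FileFinder.path_hook(
        (SourceFileLoader, SOURCE_SUFFIXES),
        (ExtensionFileLoader, EXTENSION_SUFFIXES),
    )
    sys.path_hooks.insert(0, hook)
    sys.path.insert(0, directory)
    sys.path_importer_cache.pop(directory, None)  # force a fresh finder
    try:
        return importlib.import_module(name)
    finally:
        # Undo every global change made above.
        sys.path.remove(directory)
        sys.path_hooks.remove(hook)
        sys.path_importer_cache.pop(directory, None)
        sys.modules.pop(name, None)


def demo():
    name = "order_demo_mod"  # hypothetical module name
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / f"{name}.py").write_text('KIND = "source"\n')
        for suffix in EXTENSION_SUFFIXES[:1]:
            # Empty placeholder standing in for a compiled extension.
            (Path(tmp) / (name + suffix)).touch()
        module = import_preferring_source(tmp, name)
        return module.KIND, Path(module.__file__).name


print(demo())
```

    The hook is removed again in the finally block, so the changed preference applies only to this one import.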

    As a personal note: I would try to avoid the situation where py- and so/pyd-files sit next to each other.