Tags: python, python-import, setuptools, python-packaging, python-importlib

Implicitly use namespace in imported modules in Python


I'm trying to make a library out of a Python project I don't own. The project has the following directory layout:

.
├── MANIFEST.in
├── pyproject.toml
└── src
    ├── all.py
    ├── the.py
    └── sources.py

In pyproject.toml I have:

[tool.setuptools]
packages = ["mypkg"]

[tool.setuptools.package-dir]
mypkg = "src"

The problem I'm facing is that when I build and install this package I can't use it, because the author imports things without the mypkg prefix in the various source files.

For example, in all.py:

from the import SomeThing

Since I don't own the package, I can't modify all the sources, but I still want to be able to build a library from it by adding only MANIFEST.in and pyproject.toml.

Is it possible to somehow instruct setuptools to build a package that won't litter site-packages with all the sources while still allowing them to be imported without the mypkg prefix?


Solution

  • This isn't possible without shipping a custom import hook with the package. The hook takes the form of a module that is distributed with the package, and it must be imported before the unprefixed imports run (e.g. at the top of src/all.py).

    src/mypkgimp.py

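    import importlib
    import importlib.abc   # submodules are not guaranteed to be loaded by "import importlib"
    import importlib.util
    import sys
    
    # The same object acts as the meta path finder and as the loader
    class MyPkgLoader(importlib.abc.MetaPathFinder, importlib.abc.Loader):
        def find_spec(self, name, path=None, target=None):
            # Update this list with the modules that should be treated specially
            if name in ['sources', 'the']:
                return importlib.util.spec_from_loader(name, self)
            return None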
    
        def create_module(self, spec):
            # Uncomment if "normal" imports should have precedence
            # try:
            #     sys.meta_path = [x for x in sys.meta_path[:] if x is not self]
            #     return importlib.import_module(spec.name)
            # except ImportError:
            #     pass
            # finally:
            #     sys.meta_path = [self] + sys.meta_path
    
            # Otherwise, this will unconditionally shadow normal imports
            module = importlib.import_module('.' + spec.name, 'mypkg')
            # Final step: inject the module to the "shortened" name
            sys.modules[spec.name] = module
            return module
    
        def exec_module(self, module):
            pass
    
    if not hasattr(sys, 'frozen'):
        sys.meta_path = [MyPkgLoader()] + sys.meta_path
    

    Note that the above uses different methods from those described in the thread I linked previously, because importlib deprecated those older methods in Python 3.10; refer to the documentation for details.
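
The hard-coded list in find_spec has to be kept in sync with the package contents by hand. As a rough sketch only, the names could instead be collected with pkgutil.iter_modules. The helper name submodule_names is hypothetical (it is not part of the hook above), and it assumes the package is a regular, directory-based package that is already importable.

```python
import importlib
import pkgutil


def submodule_names(package_name, exclude=()):
    """Return the sorted names of the direct submodules of a package.

    Hypothetical helper: assumes `package_name` names an importable,
    regular package (one that has a __path__ attribute).
    """
    package = importlib.import_module(package_name)
    return sorted(
        info.name
        for info in pkgutil.iter_modules(package.__path__)
        if info.name not in exclude
    )
```

Under those assumptions, something like submodule_names('mypkg', exclude={'mypkgimp'}) could be computed once and used in place of the literal ['sources', 'the']. Note that aliasing every submodule widens the set of top-level names that may shadow other installed modules.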

    Anyway, for the demo, put some dummy classes in the modules:

    src/the.py

    class SomeThing: ...
    

    src/sources.py

    class Source: ...
    

    Now, modify src/all.py to have the following:

    import mypkg.mypkgimp
    from the import SomeThing
    

    Example usage:

    >>> from sources import Source
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ModuleNotFoundError: No module named 'sources'
    >>> from mypkg import all
    >>> all.SomeThing
    <class 'mypkg.the.SomeThing'>
    >>> from sources import Source
    >>> Source
    <class 'mypkg.sources.Source'>
    >>> from sources import Error
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: cannot import name 'Error' from 'mypkg.sources' (/tmp/mypkg/src/sources.py)
    

    Note how the import initially didn't work, but once mypkg.all had been imported, the sources import works globally. Care may therefore be needed not to shadow "real" imports, which is why the commented-out block in create_module shows how to try the "default"[*] import mechanism first.
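
Another way to approach the shadowing concern, instead of the commented-out try/finally block in create_module, might be to register the hook at the end of sys.meta_path, so that it is consulted only after every other finder has failed to locate the name. The following is a minimal sketch under that assumption; FallbackAliasFinder and install_fallback are hypothetical names, not part of the hook shown earlier.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class FallbackAliasFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Resolve bare names to submodules of a package, as a last resort."""

    def __init__(self, package, names):
        self.package = package
        self.names = frozenset(names)

    def find_spec(self, name, path=None, target=None):
        # path is None only for top-level imports such as "import the".
        if path is None and name in self.names:
            return importlib.util.spec_from_loader(name, self)
        return None

    def create_module(self, spec):
        # Reuse the real submodule object instead of creating a new module.
        return importlib.import_module(f"{self.package}.{spec.name}")

    def exec_module(self, module):
        # Nothing to run: importing the real submodule already executed it.
        pass


def install_fallback(package, names):
    finder = FallbackAliasFinder(package, names)
    # Appending (not prepending) lets all other finders try first.
    sys.meta_path.append(finder)
    return finder
```

With install_fallback('mypkg', ['the', 'sources']) called from mypkgimp.py, `from the import SomeThing` would resolve to mypkg.the only when no other importable module is named the. The trade-off is that a same-named module installed elsewhere would silently win instead.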

    If you want the module names to look different (i.e. without the mypkg. prefix), that would be a separate question, since code typically doesn't check its own module name for functionality. Never mind that the output above actually shows how the namespace is implicitly used. Changing the actual name is more akin to a module relocation; it can be done, but it is a bit more complicated, and this answer is long enough as it is.

    [*] "default" as in not including behaviors introduced by this custom import hook - other import hooks may do their own other weird shenanigans.