python, python-sphinx, api-design

Refactoring a module and keeping backward compatibility, including for intersphinx


Given a Python package pack providing the pack.foo.Bar class:

pack/
    __init__.py  # empty
    foo.py
        # content of foo.py
        """
        This module does stuff using the :class:`pack.foo.Bar` class.
        """

        class Bar(object):
            pass

        # much more code here

I want to refactor the pack.foo module into a package, so that the Bar class is moved to the pack/foo/bar.py file. In order to keep backward compatibility, I can add this to the pack/foo/__init__.py file:

"""
This module does stuff using the :class:`pack.foo.Bar` class.
"""

from pack.foo.bar import Bar
__all__ = ['Bar']

Users of the API can still use from pack.foo import Bar.
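
As a quick check, here is a minimal sketch that builds the refactored layout in a temporary directory and imports it. The temporary directory and the variable names are only for illustration; the file contents mirror the snippets above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Lay out the refactored package in a throwaway directory.
root = Path(tempfile.mkdtemp())
pkg = root / "pack" / "foo"
pkg.mkdir(parents=True)
(root / "pack" / "__init__.py").write_text("")  # empty, as in the question
(pkg / "bar.py").write_text("class Bar(object):\n    pass\n")
(pkg / "__init__.py").write_text(
    '"""This module does stuff using the :class:`pack.foo.Bar` class."""\n'
    "from pack.foo.bar import Bar\n"
    "__all__ = ['Bar']\n"
)

# Make the temporary package importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from pack.foo import Bar                # the old import path still works
from pack.foo.bar import Bar as NewBar  # the new location

same_object = Bar is NewBar   # both names refer to the same class object
defined_in = Bar.__module__   # the module where the class is now defined
print(same_object, defined_in)
```

Both import paths return the same class, but Bar.__module__ now reports pack.foo.bar, the module where the class is actually defined.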

One issue remains: cross-references in Sphinx. When Sphinx parses the docstring in pack/foo/__init__.py, it can't find the target:

WARNING: py:class reference target not found pack.foo.bar.Bar

This would break the documentation built by users who reference pack through the intersphinx extension.
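
For context, a downstream project would typically enable intersphinx along these lines. This is only a sketch: the project is hypothetical and the URL is a placeholder, not the real location of pack's documentation.

```python
# conf.py of a hypothetical downstream project whose docs link to pack.
extensions = ["sphinx.ext.intersphinx"]

# Placeholder URL (an assumption for illustration). With None as the second
# element, intersphinx fetches objects.inv from the given base URL.
intersphinx_mapping = {
    "pack": ("https://pack.example.org/docs/", None),
}
```

A reference such as :class:`pack.foo.Bar` in that project resolves only if the inventory published with pack's documentation still contains an entry named pack.foo.Bar.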

What is the proper way of refactoring a package structure while keeping full backward compatibility, including the Sphinx object inventory?


Solution

  • Here is my own answer with some findings.

    There is no silver bullet in this situation.

    First, code documentation generated by sphinx-apidoc has its module layout inferred from the file layout. This means that the Bar class defined in pack/foo.py will be documented as pack.foo.Bar, no matter what import mangling happens in pack/__init__.py.

    Second, one can still use the autodoc extension. Autodoc simply tries to import the documented symbols normally, under the names given in the reStructuredText source. This way, you could generate HTML documentation for the Bar class using

    .. autoclass:: pack.Bar
        :members:
    

    There is a catch though. Any documented symbol (and each of its dependencies, transitively) must be used under the exact namespace that is intended to be documented. Consider a variation of our example, providing an additional class Baz:

    pack/
        __init__.py
            # content of __init__.py
            from pack.foo import Bar, Baz
            __all__ = ['Bar', 'Baz']
    
        foo.py
            # content of foo.py
            """
            This module does stuff using the :class:`pack.foo.Bar` class.
            """
    
            class Bar(object):
                pass
    
            class Baz(Bar):  # Here, sphinx thinks that Baz inherits from
                pass         # pack.foo.Bar because Bar.__module__ is
                             # pack.foo in this context.
    

    Sphinx will fail to resolve pack.foo.Bar, since the class is documented only as pack.Bar because of the content of pack/__init__.py.

    In order to make this work, the API's own code must use only the exact import layout that the package's API exposes. In our example, this may be achievable by defining the Bar and Baz classes in separate files. Good luck and be wary of cyclic imports!
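
The namespace mismatch described in the answer can be observed without running Sphinx. The sketch below builds in-memory stand-ins for pack/foo.py and pack/__init__.py from the variation with Baz. The helper defining_name is hypothetical, and it is an assumption here that a documentation tool derives the base class name from __module__ and __qualname__ in roughly this way.

```python
import types

# Stand-in for pack/foo.py: the classes run inside a module named "pack.foo".
foo = types.ModuleType("pack.foo")
exec(
    "class Bar(object):\n"
    "    pass\n"
    "\n"
    "class Baz(Bar):\n"
    "    pass\n",
    foo.__dict__,
)

# Stand-in for pack/__init__.py re-exporting both classes.
pack = types.ModuleType("pack")
pack.Bar, pack.Baz = foo.Bar, foo.Baz


def defining_name(cls):
    """Dotted name built from where the class was defined (hypothetical helper)."""
    return f"{cls.__module__}.{cls.__qualname__}"


public_name = "pack.Bar"                          # the name the API exposes
base_name = defining_name(pack.Baz.__bases__[0])  # the name Baz's base reports
print(public_name, base_name)
```

The two names differ: the base class of Baz identifies itself by the module it was defined in, not by the name under which the package re-exports it. This is why the defining module and the documented namespace have to agree.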