
How to "fake" a module safely in a Python package


Currently I have the following directory structure in the master git branch:

/dir1
    __init__.py
    module.py

This will be changed to (in my branch):

/dir1
    __init__.py
    /dir2
        module1.py # With 70% of code of module.py
        module2.py # With 30% of code of module.py

Issues:

  1. I know it's not possible to make git track both new files as continuations of module.py. However, since git recognizes renames (and treats moving a file into a folder as a rename), I will be able to track changes to module.py from the master branch to my branch's module1.py, at least for the 70% of the code (I'll have to update module2.py manually). Is there a better way to deal with this?

  2. For API consistency, I'd like people who use an older version of my package to still be able to use from dir1.module import abc (without having a module.py in dir1). This can be done as described here, but that comes with the dangers of messing with sys.path, which is not advised for stability and safety reasons. Is there a better way to make the API backward-compatible and still safe/stable?


However, my situation is more complex. For a more representative example, consider moving from:

/dir1
    __init__.py
    module_2.py
        def add2(a, b):
            return a + b
        def sub2(a, b):
            return a - b
    module_3.py
        def add3(a, b, c):
            return a + b + c

to:

/dir1
    __init__.py
    /dir2
        __init__.py
        module_add.py
            # ... Constitutes 70% of code from `dir1/module_2.py`
            def add2(a, b):
                return a + b
            # A few more additional lines added from `dir1/module_3.py`
            def add3(a, b, c):
                return a + b + c
        module_sub.py
            # Constitutes the 30% from `dir1/module_2.py`
            def sub2(a, b):
                return a - b

So in essence I am splitting up the functionality of dir1/module_2.py and dir1/module_3.py and regrouping it into separate module_add.py and module_sub.py files under /dir1/dir2.

However, version 1 users getting the version 2 package should still be able to do:

from module_2 import add2, sub2
from module_3 import add3

Things I can't do:

  • Have module_2.py or module_3.py in dir1 (I need git to associate and track the master branch's dir1/module_2.py to dir1/dir2/module_add.py of my branch);
  • Change or mess around sys.path in any way that reduces stability/safety; or
  • Rename dir2 to e.g. module_2.

Solution

  • Note that the following setup:

    /dir1
        __init__.py
            from .module import abc
        module.py
            abc = None
    

    is externally (pretty much) indistinguishable from:

    /dir1
        __init__.py
            from .module import abc
        /module
            __init__.py
                from .module1 import abc
            module1.py  # this is the moved and renamed module.py, with git history
                abc = None
            module2.py  # this is the 30% you've factored out
                # whatever's in here
    

    From outside module.py/module, the old imports (from dir1.module import abc, and the import in dir1/__init__.py) continue to work, because importing a package runs its __init__.py, which re-exports abc.
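
One way to convince yourself of this equivalence is to build the second layout in a scratch directory and run the old import in a fresh interpreter. The sketch below does that. The helper name run_in_layout, the NEW_LAYOUT dictionary and the placeholder value 'moved' are made up for illustration, and the explicit relative imports (from .module import abc) assume the package targets Python 3.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical contents for the new layout; 'moved' stands in for real code.
NEW_LAYOUT = {
    "dir1/__init__.py": "from .module import abc\n",
    "dir1/module/__init__.py": "from .module1 import abc\n",
    "dir1/module/module1.py": "abc = 'moved'\n",
    "dir1/module/module2.py": "# the factored-out 30% would live here\n",
}


def run_in_layout(layout, statement):
    """Write `layout` to a temp dir and run `statement` there in a fresh interpreter."""
    with tempfile.TemporaryDirectory() as root:
        for rel_path, source in layout.items():
            path = Path(root, rel_path)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(source)
        # With `-c`, the working directory is on the import path, so dir1 is importable.
        result = subprocess.run(
            [sys.executable, "-c", statement],
            cwd=root, capture_output=True, text=True,
        )
    return result.returncode, result.stdout.strip()


# The version 1 import still resolves, now through dir1/module/__init__.py.
print(run_in_layout(NEW_LAYOUT, "from dir1.module import abc; print(abc)"))
# expected: (0, 'moved')
```

The working directory is used here only so the child interpreter can find dir1. An installed package needs no path changes at all.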


    For your more complex example, you can still switch from:

    /dir1
        __init__.py
            from .module_2 import add2, sub2
            from .module_3 import add3
        module_2.py
        module_3.py
    

    to:

    /dir1
        __init__.py
            from .dir2.module_add import add2, add3
            from .dir2.module_sub import sub2
        /dir2
            __init__.py
            module_add.py  # module_2.py moved & renamed
            module_sub.py  # module_3.py moved & renamed or new file
        /module_2
            __init__.py
               from ..dir2.module_add import add2
               from ..dir2.module_sub import sub2
        /module_3
            __init__.py
               from ..dir2.module_add import add3
    

    The old code (e.g. from dir1.module_2 import add2) will still work correctly, but users can now start accessing the new locations (e.g. from dir1.dir2.module_add import add2, add3).
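
As a rough check of that claim, the following sketch recreates the second tree in a temporary directory and confirms that the old and new import paths hand back the very same function objects. The file contents follow the question's add2/sub2/add3 definitions. The names LAYOUT, CHECK and check_layout are hypothetical, and the relative imports in dir1/__init__.py assume Python 3.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents mirroring the restructured tree.
LAYOUT = {
    "dir1/__init__.py": (
        "from .dir2.module_add import add2, add3\n"
        "from .dir2.module_sub import sub2\n"
    ),
    "dir1/dir2/__init__.py": "",
    "dir1/dir2/module_add.py": (
        "def add2(a, b):\n    return a + b\n\n"
        "def add3(a, b, c):\n    return a + b + c\n"
    ),
    "dir1/dir2/module_sub.py": "def sub2(a, b):\n    return a - b\n",
    # Compatibility shims: packages named after the old modules.
    "dir1/module_2/__init__.py": (
        "from ..dir2.module_add import add2\n"
        "from ..dir2.module_sub import sub2\n"
    ),
    "dir1/module_3/__init__.py": "from ..dir2.module_add import add3\n",
}

# Script run in the child interpreter: old and new paths must agree.
CHECK = """
from dir1.module_2 import add2, sub2          # version 1 style
from dir1.module_3 import add3                # version 1 style
from dir1.dir2 import module_add, module_sub  # version 2 style
assert add2 is module_add.add2 and add3 is module_add.add3
assert sub2 is module_sub.sub2
print(add2(1, 2), sub2(1, 2), add3(1, 2, 3))
"""


def check_layout(layout, script):
    """Materialise `layout` on disk and run `script` against it in a new process."""
    with tempfile.TemporaryDirectory() as root:
        for rel_path, source in layout.items():
            path = Path(root, rel_path)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(source)
        result = subprocess.run(
            [sys.executable, "-c", script],
            cwd=root, capture_output=True, text=True,
        )
    return result.stdout.strip()


print(check_layout(LAYOUT, CHECK))  # 3 -1 6
```

Because the shims only re-export names, there is a single definition of each function, so fixes made in dir2 reach version 1 callers automatically.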


    You can also add e.g.:

    import warnings
    warnings.warn("deprecated", DeprecationWarning)
    

    to the __init__.py files in /dir1/module_2 and /dir1/module_3 to provide warnings to users that these imports are now on the way out. For example:

    >>> import warnings
    >>> warnings.simplefilter('always')
    >>> from dir1.dir2.module_sub import sub2
    >>> sub2(1, 2)
    -1
    >>> from dir1.module_3 import add3
    
    Warning (from warnings module):
      File "dir1\module_3\__init__.py", line 2
        warnings.warn("deprecated", DeprecationWarning)
    DeprecationWarning: deprecated
    >>> add3(1, 2, 3)
    6
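
A slightly more informative variant of the shim is sketched below. It names the replacement import in the message and passes stacklevel=2, which is intended to attribute the warning to the importing code rather than to the shim itself. The message text and the names MESSAGE, SHIM, LAYOUT and stderr_of are assumptions for illustration, not part of the package.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical wording; adjust to the real package's names.
MESSAGE = "dir1.module_3 is deprecated; import add3 from dir1.dir2.module_add instead"

# Hypothetical body for dir1/module_3/__init__.py.
SHIM = f'''\
import warnings

warnings.warn(
    "{MESSAGE}",
    DeprecationWarning,
    stacklevel=2,  # aim the report at the importing code, not at this shim
)

from ..dir2.module_add import add3
'''

LAYOUT = {
    "dir1/__init__.py": "",
    "dir1/dir2/__init__.py": "",
    "dir1/dir2/module_add.py": "def add3(a, b, c):\n    return a + b + c\n",
    "dir1/module_3/__init__.py": SHIM,
}


def stderr_of(statement):
    """Run `statement` against LAYOUT in a fresh interpreter and return its stderr."""
    with tempfile.TemporaryDirectory() as root:
        for rel_path, source in LAYOUT.items():
            path = Path(root, rel_path)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(source)
        # `-W always` makes sure the DeprecationWarning is not filtered out.
        result = subprocess.run(
            [sys.executable, "-W", "always", "-c", statement],
            cwd=root, capture_output=True, text=True,
        )
    return result.stderr


old = stderr_of("from dir1.module_3 import add3")
new = stderr_of("from dir1.dir2.module_add import add3")
print(MESSAGE in old, MESSAGE in new)  # True False
```

Only the old import path triggers the warning, so users who have already migrated to dir1.dir2 see nothing.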