
Importing a Python module from another repo


I have written a Python module that lives in its own repo. To get the module's unit tests to run, they can't live in a subfolder of that module. The suggested structure to fix this is:

ModuleRepo/
  MyModule/
    __init__.py
    some_utils.py
      from . import some_helpers
    some_helpers.py
  Tests/
    some_utils_test.py
      import MyModule.some_utils
    some_helpers_test.py
      import MyModule.some_helpers

That works just fine: when I run the tests, they are able to import the module. The module is able to import its own files (e.g. some_helpers) by prepending '.' to indicate the local folder.

The issue is that a different repo now wants to share this module and I don't know how to make it find the module.

For example:

ModuleRepo/
  MyModule/
    __init__.py
    ...
  Tests/
    ...
AnotherRepo/
  using_my_module.py
    import ??? <-- how to find MyModule?

NOT WORKING ATTEMPT #1: I initially tried to include ModuleRepo under AnotherRepo using git's submodule functionality. However, I don't actually want the root folder of ModuleRepo; I want only the subfolder 'MyModule'. It turns out git submodules don't do that: one can't choose only a part of a repo to include.

UNDESIRABLE SYMLINK: While a symlink might work, it's not something one can 'commit' to a repository, so it is somewhat undesirable. Additionally, I am developing on both Windows and Linux, so I need a solution that works on both.

POSSIBLE SOLUTION: Turn the ModuleRepo root into a module too (by adding an __init__.py). Then I could use git to make it a submodule of AnotherRepo. My import would be ugly, but it would work: import ModuleRepo.MyModule.some_utils instead of import MyModule.some_utils

Does anyone have any better solutions?


Solution

  • Several possibilities.

    • Tweak the sys.path variable somewhere at the top level of your code so that the ModuleRepo directory is listed there.

      The upside is that this approach works with whatever mechanism you use to keep the two repositories alongside one another, be it submodules or subtree merging.

      The downside is that you'd need to repeat this tweak in the unit tests for the code in using_my_module.py as well.

    • Use virtualenv for development.

      Part of setting up the development environment for the project would then be installing "MyModule" "the regular way" (for example with pip, which assumes ModuleRepo carries the usual packaging metadata).

    • If the "MyModule" module is not in flux and/or you are okay with manually incorporating, from time to time, the developments happening in "MyModule" into your "main" code base, you can go with so-called "vendoring" by means of the git subtree split and git subtree add commands (or so-called "subtree merging" instead of the latter).

      Basically, the git subtree split command lets you extract, from the repo hosting "MyModule", a synthetic history graph containing only the commits that touch files under the specified prefix ("MyModule", in your case), and git subtree add lets you "subtree-merge" that subgraph at the specified prefix in another repository.
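
To make the first possibility (the sys.path tweak) more concrete, here is a minimal sketch. The helper name add_repo_to_sys_path is hypothetical, and the commented usage assumes ModuleRepo is checked out next to AnotherRepo, which the question does not actually state; adjust the path to wherever the submodule or subtree really lives.

```python
import sys
from pathlib import Path


def add_repo_to_sys_path(repo_root):
    """Put repo_root at the front of sys.path, unless it is already listed.

    repo_root should be the directory that *contains* the package folder
    (ModuleRepo, not ModuleRepo/MyModule).
    """
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        # Index 0 so this checkout wins over any other copy on the path.
        sys.path.insert(0, root)
    return root


# Hypothetical usage at the top of AnotherRepo/using_my_module.py,
# assuming the two repos are siblings on disk:
#
#   add_repo_to_sys_path(Path(__file__).resolve().parent.parent / "ModuleRepo")
#   import MyModule.some_utils
```

Since the path is computed relative to the importing file with pathlib, the same code should behave the same on Windows and Linux. As noted above, the unit tests in AnotherRepo would need the same call before they import anything that depends on MyModule.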