python, python-import, git-submodules, importerror

Git submodule raises ImportError once used in a Python project


I'm using a git submodule in my Python project.

The submodule project looks like this:

-submodule_project
    - __init__.py
    - debug_util.py
    - parsing_util
        - __init__.py
        - parse.py
        - consts.py

parse.py imports debug_util.py. This structure works fine when the submodule is used as an independent project.

My project is built like this:

-project
    - __init__.py
    - file1.py
    - some_dir
        - __init__.py
        - main.py

Once I add the submodule project to my project as a git submodule, parse.py raises an ImportError. This happens when the line that imports debug_util.py is run. Just to clarify: main.py imports parse.py, which imports debug_util.py.

Can you explain what I'm doing wrong, and what solutions are available to fix this?

Here's my .gitmodules file:

[submodule "submodule_project"]
path = submodule_project
url = ../submodule_project.git

Thanks in advance to all of you!


Solution

  • Git submodules are very annoying to work with (at least they were the last time I played around with them). I'd recommend against using submodules and instead using Python's own dependency management: give submodule_project its own unique name, package it up in releases like myparser-1.2.1, and have your main project depend on that package from its setup.py.

    Problems with git submodules (from the git documentation):

    • When you clone [a project with submodules], by default you get the directories that contain submodules, but none of the files within them yet
    • You must run two commands: git submodule init to initialize your local configuration file, and git submodule update to fetch all the data from that project and check out the appropriate commit listed in your superproject
    • If you create a new branch, add a submodule there, and then switch back to a branch without that submodule, you still have the submodule directory as an untracked directory
    • Removing the directory isn’t difficult, but it can be a bit confusing to have that in there. If you do remove it and then switch back to the branch that has that submodule, you will need to run submodule update --init to repopulate it.
    • It’s quite likely that if you’re using submodules, you’re doing so because you really want to work on the code in the submodule at the same time as you’re working on the code in the main project (or across several submodules). Otherwise you would probably instead be using a simpler dependency management system (such as Maven or Rubygems).
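The first two points matter for the question: after a plain clone the submodule directory exists but is empty, so any import from it fails. A minimal sketch of a sanity check, assuming the submodule lives at submodule_project/ as in the question (the helper name `submodule_is_populated` is hypothetical, not part of git or Python):

```python
import tempfile
from pathlib import Path


def submodule_is_populated(path) -> bool:
    """Return True if `path` is a directory containing at least one entry."""
    directory = Path(path)
    return directory.is_dir() and any(directory.iterdir())


# Usage: simulate a fresh clone, where the submodule directory exists but is empty.
with tempfile.TemporaryDirectory() as tmp:
    submodule_dir = Path(tmp) / "submodule_project"
    submodule_dir.mkdir()
    print(submodule_is_populated(submodule_dir))  # False

    # After `git submodule update --init`, files such as __init__.py appear.
    (submodule_dir / "__init__.py").write_text("")
    print(submodule_is_populated(submodule_dir))  # True
```

If the check returns False for your checkout, the ImportError is probably just a missing `git submodule update --init` rather than a Python problem.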

    Problems with git submodules (my own observations):

    • You often find your submodules in weird states:
      • You wanted to peg a submodule at a specific git commit, but now it's drifted somehow and your top level project says there are changes involving your submodule.
      • Somehow files keep changing inside the submodule directory and git complains about unstaged changes in either the top level or the submodule.
      • You wanted a submodule to track master, but it's not working correctly and now you've got merge commits that aren't upstream.
    • It's annoying enough to update and init one level deep submodules, but what if one of your submodules also uses submodules?
    • A lot of third-party tools (like some IDEs or web interfaces to git) don't work well with submodules. Many of them don't even treat the core parts of git well (dealing with the staging area, merges, rebases, squashing, writing well-formatted commit messages, etc.), and they're especially bad at features that even the most experienced git users rarely use.

    You also don't mention how and where you set up the submodule from the top level project. It might be more helpful if you pasted the .gitmodules file from the top level project.
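
As for the ImportError itself, the question doesn't show the import line in parse.py, so what follows is only a sketch under an assumption. If parse.py uses a top-level import such as `import debug_util`, that name only resolves while the submodule's own directory is on sys.path. Once the code lives at submodule_project/ inside another project, the module is reachable only as submodule_project.debug_util, so a relative import is the form that keeps working. The helpers `build_layout` and `try_import` are hypothetical names that recreate the layout from the question in a temporary directory:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_layout(root: Path, import_line: str) -> None:
    """Create root/submodule_project/... with parse.py containing import_line."""
    sub = root / "submodule_project"
    (sub / "parsing_util").mkdir(parents=True)
    (sub / "__init__.py").write_text("")
    (sub / "debug_util.py").write_text("def log(msg):\n    return 'debug: ' + msg\n")
    (sub / "parsing_util" / "__init__.py").write_text("")
    (sub / "parsing_util" / "parse.py").write_text(import_line + "\n")


def try_import(import_line: str) -> str:
    """Import parse.py as the top-level project would; report 'ok' or 'ImportError'."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_layout(root, import_line)
        sys.path.insert(0, str(root))  # assumption: the project root is on sys.path
        try:
            importlib.import_module("submodule_project.parsing_util.parse")
            return "ok"
        except ImportError:
            return "ImportError"
        finally:
            # Undo the path change and forget the temporary modules.
            sys.path.remove(str(root))
            for name in [n for n in sys.modules if n.startswith("submodule_project")]:
                del sys.modules[name]
            importlib.invalidate_caches()


print(try_import("import debug_util"))          # ImportError
print(try_import("from .. import debug_util"))  # ok
```

If that assumption matches your code, switching the submodule's internal imports to relative ones (or installing it as a package, as suggested above) should let it work both standalone and nested inside another project.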