
How to have a clean repo, including submodules, after a pull request?


I work on a git submodule child_repo located in a parent repo project_repo with more submodules. I push changes and open pull requests in child_repo and it is the only repo (out of all the submodules) I have visibility of (and write rights), but the compilation process of the whole project includes the other submodules.

What would be the best idea to keep the whole parent repo project_repo clean and up to date after the pull request with my changes is accepted?

Currently I just clone the whole project_repo every time I start working on a new feature, but I do not think this is the smartest or right way.

My guess is as follows:

  • first discard all files that were not committed with git checkout -- <file> or git clean -xdf. This is because I often write scripts/stuff that help me with the task but cannot go to master.
  • git pull origin master in the project_repo for updating the references
  • git submodule update --recursive --init for updating the contents of the submodules

Solution

  • Thank you for all the detail Victor!

    I'm going to refer to the parent repo as parent_repo instead of project_repo.

    A quick question first:

    • first discard all files that were not committed with git checkout -- <file> or git clean -xdf. This is because I often write scripts/stuff that helps me with the task but cannot go to master.

    Is it correct you were thinking of doing this in parent_repo?

    Regardless, I have a pretty good idea of what you want to do.

    Assuming the settings for your child_repo inside parent_repo/.gitmodules have branch set to master, as in the example below:

    [submodule "src/child_repo"]
     path = src/child_repo
     branch = master # or whatever branch your PRs for child_repo get merged into
    
     # extra (unrelated) notes on other settings here:
     url = ../child.git # if the parent url is github.com/company/parent, use a relative url for your child_repo; this can avoid git auth/permission/repo access issues in some CI/CD environments
     fetchRecurseSubmodules = true # hopefully this setting is pointless
     # update = merge # you generally don't want to merge by default
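
The same branch setting can also be written from the command line instead of hand-editing the file. This is a minimal sketch, assuming the submodule path src/child_repo from the example above. It runs in a throwaway directory (the demo_dir name is made up for illustration) so it does not touch a real checkout:

```shell
# Work in a scratch directory; `git config -f` edits the named file directly
# and does not need to be run inside a repository.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Record the submodule path and the branch to track (assumed here: master).
git config -f .gitmodules submodule.src/child_repo.path src/child_repo
git config -f .gitmodules submodule.src/child_repo.branch master

# Read the value back to confirm what git will see.
tracked_branch=$(git config -f .gitmodules --get submodule.src/child_repo.branch)
echo "tracked branch: $tracked_branch"
```

In a real checkout, recent git releases also offer git submodule set-branch --branch master src/child_repo, which makes the same edit to .gitmodules.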
    

    After your PR to child_repo is merged, you have the right idea, but it can be optimized slightly.

    Assuming you are doing your development inside parent_repo/src/child_repo, just:

    git status # ensure you have no changes inside your `child_repo`, `git stash --all` if you do have changes.
    git checkout master
    git pull # now child_repo is on master with latest changes, assuming PR was merged into master
    cd ../.. # now you're in `parent_repo` root
    git status # clean things up if you need to (stash, etc.). Note: by default, `git stash` and other commands besides `git submodule ...` do NOT touch the contents of your submodule, so `git stash --all` should always work here
    git stash --all # Only if you have changes _besides_ your src/child_repo submodule pointer
    git checkout master
    git pull # optionally, `git pull origin master` if you like being specific
    git status # should show something like `modified: src/child_repo (new commits)`
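
The "ensure you have no changes" checks in the steps above can be scripted. This is a minimal sketch, assuming a hypothetical helper named is_clean and a throwaway repository. It relies on git status --porcelain printing nothing when the working tree is clean, and on git clean -n only listing what git clean -xdf would delete:

```shell
# Hypothetical helper: succeeds when the repo at $1 has no staged, unstaged,
# or untracked changes (ignored files are not counted).
is_clean() {
    [ -z "$(git -C "$1" status --porcelain)" ]
}

# Throwaway repository so the sketch does not touch a real checkout.
demo_repo=$(mktemp -d)
git init -q "$demo_repo"

is_clean "$demo_repo" && before=clean || before=dirty

# Simulate a helper script that must not reach master.
echo 'echo scratch' > "$demo_repo/scratch.sh"
is_clean "$demo_repo" && after=clean || after=dirty

# Dry run: -n lists what -xdf would remove without deleting anything.
preview=$(git -C "$demo_repo" clean -xdn)

echo "$before / $after"
echo "$preview"
```

When run against parent_repo, a moved submodule pointer also counts as a change, so is_clean keeps reporting dirty until the new pointer is committed.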
    

    Next we'll run git diff. More than likely you'll only see a red commit hash and a green commit hash, which is not very informative. In that case you'll probably want to run git config --global diff.submodule log, which lists the commit messages instead; this was my and my co-worker's preferred default setting. More info here: https://www.jvt.me/posts/2018/05/04/git-submodule-diff-formats/

    git diff # verify commits that have been merged in are correct, ensure you aren't deleting any commits
    git diff --submodule=diff # verify actual code changes are also correct (optional)
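
If you would rather not change your global configuration, the same setting can be scoped to a single repository. This is a minimal sketch in a throwaway repository (scratch_repo is a made-up name); in practice you would run the config command inside parent_repo:

```shell
# Throwaway repository standing in for parent_repo.
scratch_repo=$(mktemp -d)
git init -q "$scratch_repo"

# Without --global the setting is stored in this repo's .git/config only.
git -C "$scratch_repo" config diff.submodule log

# Read it back to confirm which format plain `git diff` will use here.
diff_format=$(git -C "$scratch_repo" config --get diff.submodule)
echo "diff.submodule = $diff_format"
```

A repository-level value takes precedence over a global one, so this also works as a per-project override.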
    

    Assuming you just see the messages of your few commits being added in green, you can then proceed to update parent_repo to point to the new version of child_repo. If you can push directly to master (I doubt it), you could just:

    git add .
    git commit -m 'your_child_repo_name: Learned to handle `--force` setting' # single quotes, so the shell does not try to execute the backticked text
    git push
    

    If you don't have permission to push, you'll have to open ANOTHER PR for this. More than likely, whoever has access to all the submodules should handle updating parent_repo to point to the new child_repo. It'd be very easy for your lead dev/maintainer; they just run:

    git submodule update --remote
    git add .
    git commit -m "update child_repo, child_repo2, child_repo3"
    

    With --remote, this fetches each submodule and checks out the tip of the branch configured for it in .gitmodules, which is essentially a git pull inside every submodule. (A plain git submodule update would only check out the commits parent_repo already records.) They can see your newly merged commits by running git diff --submodule=log before staging. Of course, I'm assuming this lead dev has permission to push directly to master.