How do I rebase a git superproject changing the hashes of the submodules?


Background

Assume we have two git repos, one a submodule of the other (A is the superproject, B is the submodule). Project A is not source code per se; rather, it is a project that gathers and tracks information about its submodule(s). The A repo rarely, if ever, exists on local machines. Instead, a set of scripts keeps it updated.

One day, someone realized that repo B should have been using LFS better and cleaned up the repo using git lfs migrate import. I have a list of B's old hashes and new hashes.

What I did

As repo A happens to be linear (no branching), I was able to do a git rebase --root -i, change all the commits to edit, and run a simple bash script that reset the submodule to the new hashes. Here's an example of the script:

#!/bin/bash
# Set the submodule path and the input files (one hash per line; line N of
# the new-hash file corresponds to line N of the original-hash file).
submodulePath=foo
newHashesFile=NewHashes.txt
originalHashesFile=OriginalHashes.txt

# Loop for as long as a rebase is in progress.
while test -d "$(git rev-parse --git-path rebase-merge)" || test -d "$(git rev-parse --git-path rebase-apply)"; do
    numLines=$(git ls-files --stage | grep -c "$submodulePath")
    if [ "$numLines" -eq 1 ]; then
        oldHash=$(git ls-files --stage | grep "$submodulePath" | sed -e 's/^160000 \([^ ]*\) 0.*$/\1/')
        echo "oldHash: $oldHash"
    else
        # More than one index entry for the submodule means a merge conflict; use stage 3.
        echo "merge conflict"
        oldHash=$(git ls-files --stage | grep "$submodulePath" | grep '^160000 [^ ]* 3' | sed -e 's/^160000 \([^ ]*\) 3.*$/\1/')
        echo "oldHash: $oldHash"
    fi

    # Find the line of the old hash; the new hash is on the same line of the other file.
    lineNumber=
    [ -n "$oldHash" ] && lineNumber=$(grep -n "$oldHash" "$originalHashesFile" | head -n 1 | cut -d: -f1)

    if [ -z "$lineNumber" ]; then
        echo "Hash not changed"
    else
        newHash=$(sed -n "${lineNumber}p" "$newHashesFile")
        (cd "$submodulePath" && git reset --hard "$newHash")
    fi

    git add "$submodulePath/"
    git commit --amend
    git rebase --continue
done

Question

All this worked, but I was wondering if there is a simpler way to do it, as I assume I'll be called on to do this again. There are two parts to that question.

  1. Is there a simple way to tell git that you want the default to be edit instead of pick, not dependent on the editor?
  2. Is there a simpler way of telling git to do what the script does? Would it help if I did the git lfs migrate import from within the superproject?

Solution

  • Is there a simple way to tell git that you want the default to be edit instead of pick, not dependent on the editor?

    No. You can, however, give the rebase todo list (the sequence of commands) its own editor, separate from the one used for commit messages, by setting the environment variable GIT_SEQUENCE_EDITOR. So, for instance, you can do:

    GIT_SEQUENCE_EDITOR="sed -i '' s/^pick/edit/" git rebase -i ...
    

    (This assumes a BSD-style sed, where -i takes a separate suffix argument; with GNU sed, drop the ''.)
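    Because -i behaves differently between sed implementations, one way to sidestep it is a tiny script that writes to a temporary file instead. The following is only a sketch: pick_to_edit is a hypothetical helper name, not something git provides, and it assumes todo lines start with "pick " as in a default interactive rebase.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): rewrite every "pick" line of a
# rebase todo file to "edit" without relying on sed -i.
pick_to_edit() {
    todo=$1
    # Write to a sibling temp file, then replace the original.
    sed 's/^pick /edit /' "$todo" > "$todo.tmp" && mv "$todo.tmp" "$todo"
}

# git invokes the sequence editor with the todo file path as its argument.
if [ "$#" -gt 0 ]; then
    pick_to_edit "$1"
fi
```

    Saved as an executable file at some path of your choosing, it would be used as GIT_SEQUENCE_EDITOR=/path/to/that-script git rebase -i --root. Comment lines in the todo list are left alone because the pattern is anchored at the start of the line.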

  • Is there a simpler way of telling git to do what the script does?

    Given that you want to update each gitlink hash, I'd use git filter-branch (rather than git rebase) with an --index-filter that does the gitlink hash updates. I'm not sure this is any simpler, but it's more direct. The index filter would use git ls-files --stage much as your script does, then rewrite the hashes with either a generated sed script or an awk script. The generated sed script would probably be faster, while awk would be simpler, especially with a modern awk that can read the hash mapping in directly.
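    As a rough sketch of the awk variant, the function below reads git ls-files --stage output and emits replacement entries for the gitlinks whose hash appears in a mapping file. The name remap_gitlinks and the mapping format (one "old-hash new-hash" pair per line, which could be built with paste from the two hash lists) are assumptions made for illustration; I have not run this against a real history.

```shell
#!/bin/sh
# Hypothetical helper: read `git ls-files --stage` output on stdin and print,
# in the same format, only the gitlink entries (mode 160000) whose hash is
# listed in the mapping file, with the old hash swapped for the new one.
# Assumed mapping format: "<old-hash> <new-hash>", one pair per line.
remap_gitlinks() {
    awk -v map="$1" '
        BEGIN {
            # Load the old-to-new hash mapping into an array.
            while ((getline line < map) > 0) {
                split(line, pair, " ")
                new[pair[1]] = pair[2]
            }
        }
        $1 == "160000" && ($2 in new) {
            # Keep everything from the TAB onward so paths with spaces survive.
            print $1, new[$2], $3 substr($0, index($0, "\t"))
        }
    '
}

# Intended use inside the index filter (absolute paths, since filter-branch
# runs its filters from a temporary directory):
#   git filter-branch --index-filter '
#       . /path/to/this-file
#       git ls-files --stage | remap_gitlinks /path/to/hashmap.txt |
#           git update-index --index-info' -- --all
```

    The output lines are in a format that git update-index --index-info accepts, so entries whose hash is not in the mapping are simply left untouched in the index.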