
Git rebase interactive error


I want to merge my branch (let's call it my_branch) into master. One file shows up as a merge conflict on my pull request on github.com. Before rebasing I want to squash my commits (11 commits). So I did something like this:

# On master
git pull
git checkout my_branch
# on my_branch
git fetch
git rebase -i origin/master

This opened a vim editor with all my commits. I kept the 1st one as pick and changed the rest to s (squash):

pick commit1
s commit2
s commit3
.
.
.
s commit11

When I save and quit I get an error - error: could not apply e7ce468... 'commit1 message'

Can anyone explain to me what the issue is? I know I cannot rebase without squashing, as each and every commit would then need to be resolved.


Solution

  • You said:

    Before rebasing I want to squash my commits (11 commits)

    (emphasis on "Before" mine). But soon enough, you ran a git rebase that is going to do the squashing during or after rebasing, and that is going to be a bit of a problem. We'll get back to that in a moment.

    So I did something like this ...

    It's always iffy to say "I did something like <fill in the blank>" because then we don't know what you actually did, which makes it hard to guess what's really going on. :-) Fortunately, these all make a lot of sense:

    # On master
    git pull
    

    This wasn't technically necessary yet, but let's cover what it does. The git pull command is simply a convenience shortcut for git fetch followed by a second Git command. The second Git command defaults to git merge, but you can configure it to be git rebase. Let's assume for now that the second command was just git merge and/or that it did nothing of particular importance to us, which is probably true.

    (The git fetch part is always safe and didn't hurt anything.)
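
    To make the "fetch, then a second command" idea concrete, here is a small sketch that does by hand what a default git pull would do. Everything in it is a throwaway setup of my own invention: a bare repository under a temporary directory plays the role of origin, and the clone names first and second are just placeholders.

    ```shell
    set -e
    tmp=$(mktemp -d)

    # A bare repository stands in for the remote called "origin" (an assumption
    # made for this demo, not anything from the question).
    git init -q --bare "$tmp/origin.git"
    git -C "$tmp/origin.git" symbolic-ref HEAD refs/heads/master

    # First clone: create one commit and publish it.
    git clone -q "$tmp/origin.git" "$tmp/first" 2>/dev/null
    cd "$tmp/first"
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"
    echo one > file.txt; git add file.txt; git commit -q -m "one"
    git push -q origin master

    # Second clone: this one plays "your" repository.
    git clone -q "$tmp/origin.git" "$tmp/second"

    # Upstream moves on after the second clone was made.
    echo two >> file.txt; git commit -q -am "two"
    git push -q origin master

    cd "$tmp/second"
    git fetch -q origin          # step 1: bring origin/master up to date
    git merge -q origin/master   # step 2: the default second command of git pull
    ```

    After these two commands, master in the second clone sits on the same commit as origin/master, which is what a plain git pull would have achieved here.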

    git checkout my_branch
    # on my_branch
    git fetch
    git rebase -i origin/master
    

    Unless the upstream is changing rapidly, this second git fetch—the first was in git pull—was unnecessary, but as always, extra fetching is harmless, and could be a good habit. The problem is with the git rebase -i origin/master, which starts rebasing. You then edited the instructions to do squash but all the squashes will happen during the rebase, i.e., you won't be squashing before rebasing onto the new tip commit from origin/master.

    Given that there's an upcoming conflict, it is going to happen somewhere during the 11 commits you're copying. It seems to have happened right in the very first one:

    When I save and quit I get an error - error: could not apply e7ce468... 'commit1 message'

    The nice thing about this early merge conflict is it's much easier to explain than a later merge conflict, although at this point the things you can do about it are the same here as anywhere else.

    The error tells you that Git was not able to complete the copy of the very first commit.

    You can stop the in-progress rebase, putting everything back the way it was before you even started, using:

    git rebase --abort
    

    Read on to decide whether you want to do that and try a different approach.
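
    If you want to convince yourself that aborting really does put everything back, here is a minimal sketch in a scratch repository. The file name and commit messages are made up, and a local master stands in for origin/master so that no remote is needed.

    ```shell
    set -e
    repo=$(mktemp -d); cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"
    echo "base" > file.txt; git add file.txt; git commit -q -m "star"

    # One commit on my_branch and one on master, both touching the same line.
    git checkout -q -b my_branch
    echo "mine" > file.txt; git commit -q -am "commit1"
    git checkout -q master
    echo "theirs" > file.txt; git commit -q -am "L"

    git checkout -q my_branch
    before=$(git rev-parse my_branch)

    # The rebase stops on the conflict and exits non-zero, hence the "|| true".
    git rebase master >/dev/null 2>&1 || true
    conflict_status=$(git status --short)
    echo "during the rebase: $conflict_status"

    # Give up and return to the pre-rebase state.
    git rebase --abort
    after=$(git rev-parse my_branch)
    echo "before: $before"
    echo "after:  $after"
    ```

    The two IDs printed at the end should be identical, and file.txt is back to the version from your own commit.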

    Rebase works by copying commits

    When working with Git, you need to keep the "commit graph" in mind, if only as a sort of background, shadowy, ominous looming thing :-) most of the time. Actually the commit graph, although it can be intimidating, is meant to be helpful. It's just often very large and "loom-ish". (Think of it as a really big, but friendly, dog, perhaps.)

    The commit graph is something we can draw, and if we do, it comes out looking something like this:

    ...--o--o--*--o--o--o--o--L        <-- origin/master
               |
               |
                \
                 A--B--C--D--E--F--G--H--I--J--K   <-- my_branch
    

    Each of these single letter names is short for an actual commit ID (one of those big ugly 40-character hashes like 7c56b20857837de401f79db236651a1bd886fbbb). The round o nodes represent more commits that I don't need to say anything about, so they're boring. The * is a commit whose ID I don't know, and don't have a name for either, but it's very interesting, so I gave it a star.

    (Older commits are towards the left, and the branch names point to the tip commits of each branch. This last bit is how Git really works: branch names point to tip commits, and each commit points backwards, to its earlier parent commit. The internal backwards arrows are kind of distracting and annoying, and hard to draw in ASCII, so I just draw these as lines instead.)

    Note that A through K is 11 commits, and L is the tip commit to which origin/master points (after your git fetches brought your Git up to date with the Git repository on origin). I just guessed randomly as to how many ordinary boring o commits to draw in, but it does not really matter.
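
    You don't have to draw the graph by hand: Git will draw it for you. The sketch below builds a toy repository with the same shape as the drawing (the commit messages star, A through K, and L are my labels, and a local master stands in for origin/master) and then asks Git to show it and to count the commits that are only on my_branch.

    ```shell
    set -e
    repo=$(mktemp -d); cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"

    # Commit "*": the point where my_branch forks off.
    echo base > base.txt; git add base.txt; git commit -q -m "star"

    # Eleven commits, A through K, on my_branch.
    git checkout -q -b my_branch
    for name in A B C D E F G H I J K; do
        echo "$name" >> work.txt; git add work.txt; git commit -q -m "$name"
    done

    # One more commit on master, playing the role of L.
    git checkout -q master
    echo upstream > upstream.txt; git add upstream.txt; git commit -q -m "L"

    # Draw both branches, then count the commits reachable only from my_branch.
    git --no-pager log --graph --oneline master my_branch
    count=$(git rev-list --count master..my_branch)
    echo "commits only on my_branch: $count"
    ```

    In your real repository, the equivalent would be to name origin/master and my_branch instead of the local master used here.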

    When you run git rebase origin/master (with or without -i), Git is going to try to copy all 11 of your commits, with the copies going "after" commit L. That is, it wants to change the graph to look like this:

    ...--o--o--*--o--o--o--o--L        <-- origin/master
               |               \
               |                A'-B'-C'-...-J'-K'   <-- my_branch
                \
                 A--B--C--D--E--F--G--H--I--J--K   [abandoned]
    

    (When you make this an interactive rebase and use squash, Git simply modifies the copying process so that it first makes A' as usual, but then makes B' out of A' + B and links that to L, then makes C' out of B' + C, and so on. The key here is that Git is still doing everything one step at a time.)

    The problem here is that each step can result in a merge conflict. It's not clear whether each step will—we know the first one does, but we don't know much about the rest, yet.
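
    The "copying" part is easy to observe when there happen to be no conflicts. This sketch (a scratch repository, invented file names, local master standing in for origin/master) rebases two commits and shows that the branch tip gets a new ID while the original commit is still in the repository.

    ```shell
    set -e
    repo=$(mktemp -d); cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"
    echo base > base.txt; git add base.txt; git commit -q -m "star"

    git checkout -q -b my_branch
    echo a > a.txt; git add a.txt; git commit -q -m "A"
    echo b > b.txt; git add b.txt; git commit -q -m "B"

    git checkout -q master
    echo l > l.txt; git add l.txt; git commit -q -m "L"

    git checkout -q my_branch
    old_tip=$(git rev-parse HEAD)
    git rebase -q master
    new_tip=$(git rev-parse HEAD)

    echo "B  (original): $old_tip"
    echo "B' (copy):     $new_tip"
    # The original is abandoned by the branch name, not deleted.
    git cat-file -e "$old_tip" && echo "original B still exists"
    ```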

    If only there were a way to squash all 11 commits while leaving them attached to the commit I marked *. That is, instead of copying every commit one at a time, adding them on to L, what if we could get Git to just make one new commit AK that represents the sum of A+B+C+...+J+K? We might draw this nice result like this:

    ...--o--o--*--o--o--o--o--L        <-- origin/master
               |\
               | AK   <-- my_branch
                \
                 A--B--C--D--E--F--G--H--I--J--K   [abandoned]
    

    and now we could do a simple rebase of AK (which will still conflict with L, but we'll only have to solve problems once).

    Rebase: choose your target

    Well, it turns out this is really quite easy. What we need is to run git rebase -i as before, but tell it: don't rebase onto L, rebase onto *. That is, copy the chain from where it comes off the other chain, to where it is right now.

    We're already based on *, but if we "rebase" onto where we are now, we can easily squash everything: there will be no conflicts and the squashes will just work. We just need to name commit * somehow.

    How can you find commit *? One way is to use git log, which shows you all the commit IDs. Cut and paste the ID of the commit you care about:

    $ git rebase -i <id-of-commit-*>
    

    and squash as before and Git will do what you want: squash before rebasing. (Well, not exactly "before", but as a littler, squash-y rebase, before the big rebase.)

    Another way to find the ID of commit * is to use git merge-base:

    $ git merge-base my_branch origin/master
    

    This prints out the ID, ready for cut-and-paste (or you can use shell syntax to insert the ID directly into commands, or set a variable, or whatever).
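
    Here is a sketch of the "insert the ID directly" variant, run in a scratch repository so that it can execute without anyone typing into an editor. Two things in it are stand-ins rather than what you would do by hand: GIT_SEQUENCE_EDITOR uses sed to change every pick after the first into squash (this assumes GNU-style sed -i), and GIT_EDITOR=true accepts the combined commit message as-is. The local master again plays origin/master.

    ```shell
    set -e
    repo=$(mktemp -d); cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"
    echo base > base.txt; git add base.txt; git commit -q -m "star"

    git checkout -q -b my_branch
    for name in a b c; do
        echo "$name" > "$name.txt"; git add "$name.txt"; git commit -q -m "$name"
    done

    git checkout -q master
    echo l > l.txt; git add l.txt; git commit -q -m "L"
    git checkout -q my_branch

    # Find commit "*" and rebase onto it, squashing along the way.
    base=$(git merge-base my_branch master)
    GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
        git rebase -i "$base" >/dev/null 2>&1

    git --no-pager log --oneline "$base"..my_branch
    ```

    In an interactive session you would simply run git rebase -i "$(git merge-base my_branch origin/master)" and edit the instruction sheet yourself.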

    There's a shortcut too

    Feel free to ignore this shortcut idea, but you can just:

    $ git checkout my_branch
    $ git reset --soft <id-of-commit-*>
    $ git commit
    

    What this does is to abandon the A--B--...--J--K chain (making my_branch point to commit *), but keep the files in the work-tree and index as they are in commit K, then make a new commit using the current index. So this makes a commit whose tree matches that of K, but whose parent is commit *. (You do have to write a whole new commit message, while squashing lets you edit a message from the original 11 commits instead.)
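
    A quick way to check that claim is to compare tree IDs before and after. The following sketch does that in a scratch repository (three commits instead of eleven, invented file names, local master in place of origin/master, and a commit message of my own choosing):

    ```shell
    set -e
    repo=$(mktemp -d); cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"
    echo base > base.txt; git add base.txt; git commit -q -m "star"

    git checkout -q -b my_branch
    for name in a b c; do
        echo "$name" > "$name.txt"; git add "$name.txt"; git commit -q -m "$name"
    done

    git checkout -q master
    echo l > l.txt; git add l.txt; git commit -q -m "L"
    git checkout -q my_branch

    base=$(git merge-base my_branch master)
    tree_before=$(git rev-parse 'my_branch^{tree}')

    # Move the branch name back to "*", keeping index and work-tree untouched,
    # then commit whatever is in the index as one new commit.
    git reset -q --soft "$base"
    git commit -q -m "Squashed a through c"

    tree_after=$(git rev-parse 'my_branch^{tree}')
    echo "tree before: $tree_before"
    echo "tree after:  $tree_after"
    ```

    The two tree IDs should match, and the new commit's parent is the merge base.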

    Or, just do the conflict resolution

    No matter what, you'll eventually want to rebase either the new AK commit, or the original A--B--...--J--K chain, onto commit L. This will require resolving the merge conflicts.

    The advantage to doing it with AK is that you only have to resolve once, instead of once per conflicting commit. The disadvantage is that it may be clearer what to do when resolving one smaller conflict at a time: perhaps A has a conflict that's obvious, and J has one that's different and also obvious, and K has one that's similar to A but still obvious ... but when you rebase the combined AK all at once, the clash between A and K, plus the distraction of J, makes it hard to see how to resolve it.

    Hence, the plus to squashing first is fewer commits that may conflict. The minus to squashing first is fewer commits that may conflict. It's up to you which one to try.
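
    Whichever route you pick, the resolution loop itself looks the same. Below is a minimal sketch of it with a single already-squashed commit, in a scratch repository. The conflicting file, its contents, and the way the conflict gets "resolved" (by just writing out both lines) are all invented for the demo, and a local master plays origin/master.

    ```shell
    set -e
    repo=$(mktemp -d); cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"
    echo "base" > file.txt; git add file.txt; git commit -q -m "star"

    # The squashed commit AK on my_branch, and a conflicting L on master.
    git checkout -q -b my_branch
    echo "mine" > file.txt; git commit -q -am "AK"
    git checkout -q master
    echo "theirs" > file.txt; git commit -q -am "L"
    git checkout -q my_branch

    # The rebase stops at the conflict (non-zero exit, hence "|| true").
    git rebase master >/dev/null 2>&1 || true

    # In real life you would edit the conflict markers by hand; here the
    # resolved content is simply written out.
    printf 'theirs\nmine\n' > file.txt
    git add file.txt

    # GIT_EDITOR=true keeps the original commit message without prompting.
    GIT_EDITOR=true git rebase --continue >/dev/null 2>&1
    git --no-pager log --oneline master..my_branch
    ```

    With the original chain instead of AK, you would repeat the edit, git add, git rebase --continue steps once for every commit that conflicts.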

    Some final thoughts

    No matter what you do, remember that Git always works by adding new commits. The old ones, even if they're "abandoned", are still in your repository. They stick around for at least 30 days by default, and their IDs are saved away in Git's "reflogs". If you want to get the old commits back, you can simply make a new branch or tag name using the saved IDs.

    There's one reflog for HEAD and one for each branch. The my_branch reflog is the easier one to use for recovering from rebases.
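
    As a sketch of what that recovery can look like: after a completed rebase, the branch's own reflog has the pre-rebase tip one step back, so my_branch@{1} names it. The scratch repository, file names, and the branch name rescued below are all made up for the demo, and a local master stands in for origin/master.

    ```shell
    set -e
    repo=$(mktemp -d); cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    git config user.name "Demo"; git config user.email "demo@example.com"
    echo base > base.txt; git add base.txt; git commit -q -m "star"

    git checkout -q -b my_branch
    echo a > a.txt; git add a.txt; git commit -q -m "A"
    echo b > b.txt; git add b.txt; git commit -q -m "B"

    git checkout -q master
    echo l > l.txt; git add l.txt; git commit -q -m "L"

    git checkout -q my_branch
    old_tip=$(git rev-parse my_branch)
    git rebase -q master

    # The branch reflog lists where my_branch has pointed, newest first.
    git --no-pager reflog show my_branch

    # Give the abandoned chain a name again (hypothetical branch name).
    git branch rescued 'my_branch@{1}'
    ```

    The rescued branch now points at the original, un-rebased commit B, and my_branch is left alone.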