
Git: How to convert an existing `merge` to a `merge --squash`?


I performed multiple merge commits but they should have been merge --squash instead. The conflict resolution took more than a day so I can't afford to redo the merging by hand.

Is there a way to convert the merge to merge --squash?


Solution

  • It's worth noting here that git merge and git merge --squash are closely related, but git merge --squash does not create a merge.

    The phrasing here is very important, particularly the article "a" in front of "merge": "a merge" is a noun, while "to merge" is a verb. Both commands perform the act of merging. The difference lies in how the result is saved.

    It's also worth a quick reminder, in the form of a diagram of commits, of how merges look. Each round o node here represents a commit, with earlier commits towards the left. Branch names are labels that point to one specific commit (the tip commit of a branch). You might start with this, for instance:

    ...--o--*--o--o   <-- main
             \
              o--o--o   <-- feature
    

    You then decide to merge the one specific feature branch back into the main branch, so you run:

    $ git checkout main && git merge feature
    

    This does a merge (verb) and makes a merge (noun), and the result looks like this:

    ...--o--*--o--o---o   <-- main
             \       /
              o--o--o   <-- feature
    

    Git has added one new commit to main, and this new commit is a merge commit: it points back to both the previous tip of main, and also back (and in this case, downward as well) to the still-current tip of feature. (The name feature continues to point to the same commit as before.)
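    The two parent links described above can be inspected directly. The following is a minimal sketch in a throwaway repository; the file names, commit messages, and placeholder identity are assumptions made up for the demonstration.

```shell
#!/bin/sh
# Throwaway demo: make a real merge and inspect its two parents.
set -eu
cd "$(mktemp -d)"                        # scratch directory, safe to delete afterwards
git init -q
git symbolic-ref HEAD refs/heads/main    # name the unborn branch "main" on any Git version
git config user.name "Demo User"         # placeholder identity, only for this repository
git config user.email "demo@example.com"

echo base > base.txt;    git add .; git commit -q -m "base (the * commit)"
git checkout -q -b feature
echo feat > feature.txt; git add .; git commit -q -m "feature work"
git checkout -q main
echo main > main.txt;    git add .; git commit -q -m "main work"

old_main=$(git rev-parse main)           # tip of main before the merge
feature_tip=$(git rev-parse feature)     # tip of feature before the merge

git merge -q --no-edit feature           # merge (verb) that also records a merge (noun)

first_parent=$(git rev-parse 'HEAD^1')   # should be the previous tip of main
second_parent=$(git rev-parse 'HEAD^2')  # should be the tip of feature
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```

    Running git log --graph --oneline in the same directory draws the same shape as the diagram above.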

    What git merge --squash does is to modify the last step. Instead of making a merge commit, it suppresses committing entirely—for no obvious reason—and forces you to run git commit. When you do run git commit, this makes an ordinary commit, instead of a merge commit, so that the result looks like this:

    ...--o--*--o--o---o   <-- main
             \
              o--o--o   <-- feature
    

    There are two key items here:

    • The contents of the new commits are the same. Both new commits are made from the index (the index is Git's term for "what goes into the next commit you make"). This index is set up by the merge-as-a-verb process.

    • The parent linkages of the new commits differ. A real merge has both previous commits as its parents: the first parent is the "main-line" commit, from the branch we are on when we do the merge, and the second parent is the other commit, the one we just merged. A "squash merge" has only one previous commit as its parent, the "main-line" commit. This means it does not (and cannot) remember which commit was merged in.
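    Both key items can be checked side by side. This sketch performs a real merge on main and a squash merge on a second branch that starts at the same commit; the branch name squashed and the file names are hypothetical.

```shell
#!/bin/sh
# Throwaway demo: same contents, different parent linkage.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"

echo base > base.txt;    git add .; git commit -q -m "base"
git checkout -q -b feature
echo feat > feature.txt; git add .; git commit -q -m "feature work"
git checkout -q main
echo main > main.txt;    git add .; git commit -q -m "main work"

git branch squashed                      # second label on the same starting commit

git merge -q --no-edit feature           # real merge on main

git checkout -q squashed
git merge -q --squash feature            # stages the merge result, commits nothing
git commit -q -m "squash of feature"     # ordinary commit made from the index

real_tree=$(git rev-parse 'main^{tree}')
squash_tree=$(git rev-parse 'squashed^{tree}')
real_parents=$(git show -s --format=%P main | wc -w | tr -d ' ')
squash_parents=$(git show -s --format=%P squashed | wc -w | tr -d ' ')

echo "trees:   $real_tree / $squash_tree"
echo "parents: $real_parents (real) vs $squash_parents (squash)"
```

    The tree hashes name the saved snapshot, so equal hashes mean identical contents; only the parent counts differ.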

    In the case of a conflicted (but real) merge, git merge cannot make the new commit on its own, so it stops and forces you to resolve the conflicts. Once you have finished resolving those conflicts, you must manually run git commit, just as you must always (for no obvious reason) do with git merge --squash and its fake merges. You can also request that any real merge stop and let you inspect the result, using git merge --no-commit.

    This leads to the easy method to turn a real merge into a fake (squash) merge, as long as the real merge is not yet committed:

    • For git commit to know to make a merge, it relies on a file left behind by the conflicted (or --no-commit) merge. This file is named .git/MERGE_HEAD. (It also leaves behind a file named .git/MERGE_MSG although this extra file is harmless.)

    • Therefore, you can simply remove .git/MERGE_HEAD and run git commit. (You may want to remove the MERGE_MSG file as well, once you've written your commit with its message. Until then you can use it as the message, or as its starting point.) The git commit step will no longer know to make a merge commit, and will instead make an ordinary commit. Et voilà: you have made a squash merge.
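    The two steps above can be sketched as follows, using git merge --no-commit to stand in for a merge that stopped on conflicts. The repository contents are hypothetical, and git rev-parse --git-dir is used to locate the .git directory rather than assuming its path.

```shell
#!/bin/sh
# Throwaway demo: turn an uncommitted real merge into a squash merge.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"

echo base > base.txt;    git add .; git commit -q -m "base"
git checkout -q -b feature
echo feat > feature.txt; git add .; git commit -q -m "feature work"
git checkout -q main
echo main > main.txt;    git add .; git commit -q -m "main work"
old_main=$(git rev-parse HEAD)

git merge -q --no-commit feature         # real merge, stopped before committing

git_dir=$(git rev-parse --git-dir)
test -f "$git_dir/MERGE_HEAD"            # the file that tells git commit to make a merge
rm "$git_dir/MERGE_HEAD"

git commit -q -m "squash of feature"     # now an ordinary, single-parent commit

parents=$(git show -s --format=%P HEAD)
echo "parents of new commit: $parents"
```

    The new commit still contains feature.txt, because the index was already set up by the merge before MERGE_HEAD was removed.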


    If you have already made a real merge commit, the process may be harder. In particular, if you have published the merge, you must now get everyone who has obtained it to discard it. Otherwise the real merge will return (they are likely to merge it again), or cause problems for them, or both. If you can do that, though, or if the merge has not been published, you merely need to "shove the merge aside", as it were, and put in a fake merge commit.
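    One way to get a hint about whether a merge has been published is to ask which remote-tracking branches contain it. This is only a sketch: the local bare repository stands in for a shared server, the names are hypothetical, and remote-tracking branches only reflect your last fetch or push, so an empty answer is not proof that nobody has the commit.

```shell
#!/bin/sh
# Throwaway demo: check whether a merge commit is on any remote-tracking branch.
set -eu
work=$(mktemp -d)
git init -q --bare "$work/remote.git"    # stand-in for the shared repository
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo base > base.txt;    git add .; git commit -q -m "base"
git checkout -q -b feature
echo feat > feature.txt; git add .; git commit -q -m "feature work"
git checkout -q main
echo main > main.txt;    git add .; git commit -q -m "main work"
git push -q origin main                  # main is published before the merge

git merge -q --no-edit feature
merge=$(git rev-parse HEAD)

# Empty output means no remote-tracking branch contains the merge.
before_push=$(git branch -r --contains "$merge")
git push -q origin main                  # now the merge is published
after_push=$(git branch -r --contains "$merge")

echo "before push: '$before_push'"
echo "after push:  '$after_push'"
```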

    Let's redraw what we have, to make some room. This is the same graph as before, just spread out onto more lines:

    ...--o--*----o----o 
             \         \
              \         o   <-- main
               \       /
                o--o--o   <-- feature
    

    What if we could move main back up to the top line, then make a new commit? Well, we would get the following drawing:

    ...--o--*----o----o--o   <-- main
             \         \
              \         o   [abandoned]
               \       /
                o--o--o   <-- feature
    

    Note that all the arrows—including the internal commit arrows that the commits use to record history (not shown here as they're too hard to produce in text drawings)—point leftward, so there's no knowledge, in any of the top-line commits, of any of the commits below them: those are all "to their right". Only the bottom-line commits, plus the one we're going to abandon, know anything about the top-line commits.

    Moreover, if we're going to abandon the middle-line commit entirely, let's just stop drawing it, and its two merge arrows:

    ...--o--*----o----o--o   <-- main
             \
              \
               \
                o--o--o   <-- feature
    

    And look at that: we've just drawn the kind of commit graph we want for one of those fake squash "merge"s. This is just what we want, as long as we get the right contents in our new commit.

    But (this is a sort of exercise; see if you know the answer before plunging on) where do the contents of a new commit come from? The answer is above, in the first of the two key items. (If you've forgotten it, go back and check.)


    Now that you know that git commit uses the index to make the new commit, let's consider the real merge commit we have now. It has contents. (All commits have contents.) Where did they come from? They came from the index! If we can just somehow get those contents back into the index, we're golden. And in fact, we can: all we have to do is check out that commit, with git checkout, or already have it as the current commit.

    Since we just made the new merge commit, we already have the merge commit as the current commit. The index is clean: if we run git status, it says there's nothing to commit. So now we use git reset --soft to reset the branch pointer main back one step, to get our intermediate drawing. The --soft argument tells Git: "move the branch pointer, but don't change the index or the work-tree." So we'll still have the index (and work-tree) from the original merge. Now we just run git commit to make an ordinary commit, supply some appropriate commit message, and we're done: we have a squash merge. The original merge is now abandoned. It stays recoverable through the reflog until its reflog entries expire (30 days by default for commits no longer reachable from the branch); some time after that, Git's garbage collection will notice that it's no longer used and will remove it.

    (To move back one step, you can use HEAD~1 or HEAD^; both mean the same thing. Hence the command sequence is just git reset --soft HEAD^ && git commit, assuming the current commit is the real merge you wish to replace with a fake merge.)
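    The command sequence above can be tried safely in a scratch repository first. This sketch uses hypothetical file names and passes -m to git commit so that no editor opens; it also reads ORIG_HEAD, which git reset sets to the previous tip, so the abandoned merge can still be found.

```shell
#!/bin/sh
# Throwaway demo: replace a committed real merge with a squash-style commit.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"

echo base > base.txt;    git add .; git commit -q -m "base"
git checkout -q -b feature
echo feat > feature.txt; git add .; git commit -q -m "feature work"
git checkout -q main
echo main > main.txt;    git add .; git commit -q -m "main work"
old_main=$(git rev-parse HEAD)

git merge -q --no-edit feature           # the real merge we want to replace
merge=$(git rev-parse HEAD)
merge_tree=$(git rev-parse 'HEAD^{tree}')

git reset -q --soft 'HEAD^'              # move main back; keep index and work-tree
git commit -q -m "squash of feature"     # ordinary commit with the merge's contents

new_tree=$(git rev-parse 'HEAD^{tree}')
new_parents=$(git show -s --format=%P HEAD)
abandoned=$(git rev-parse ORIG_HEAD)     # git reset recorded the old tip here
leftover=$(git status --porcelain)       # empty when nothing is left to commit

echo "old merge (abandoned): $abandoned"
echo "new commit parent:     $new_parents"
```

    If the result is not what you wanted, git reset --hard with the abandoned hash puts main back on the original merge, as long as Git has not yet pruned it.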


    The longer method above can be used even if you made multiple merge commits. You will have to decide, though, whether you want multiple fake merges, or one big fake merge. For instance, suppose you have this:

    ...--o--o---o--o--o   <-- develop
          \  \    /  /
           \  o--o  /   <-- feature1
            \      /
             o----o   <-- feature2
    

    Do you want your final picture to look like:

    ...--o--o---o--o--o   <-- develop
          \  \
           \  o--o    <-- feature1
            \
             o----o   <-- feature2
    

    where the last two commits on develop are the two fake (squash) merges, or do you just want one big fake-squash:

    ...--o--o---o--o   <-- develop
          \  \
           \  o--o    <-- feature1
            \
             o----o   <-- feature2
    

    where that last commit is the final result? If you want two fake-squashes, you will need to retain both original merge commits long enough to get them into the index and make two ordinary commits from those. If you just want one big fake-squash of the final merge, that takes only two Git commands:

    $ git reset --soft HEAD~2 && git commit
    

    since we'll retain the second merge's index, move back two steps, then make the new commit that shoves aside both merges.
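    Here is a sketch of the one-big-fake-squash case, rebuilding a graph shaped like the develop/feature1/feature2 drawing above. The commit_file helper and all file names are hypothetical, invented only to keep the setup short.

```shell
#!/bin/sh
# Throwaway demo: collapse two real merges into one squash-style commit.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"

commit_file() {                          # hypothetical helper: one new file, one commit
  echo "$1" > "$1.txt"; git add .; git commit -q -m "$1"
}

commit_file base
git checkout -q -b feature2; commit_file f2a; commit_file f2b
git checkout -q develop;     commit_file d1
git checkout -q -b feature1; commit_file f1a; commit_file f1b
git checkout -q develop;     commit_file d2
pre_merge=$(git rev-parse HEAD)          # tip of develop before either merge

git merge -q --no-edit feature1          # first real merge
git merge -q --no-edit feature2          # second real merge
final_tree=$(git rev-parse 'HEAD^{tree}')

git reset -q --soft 'HEAD~2'             # step back over both merges (first parents)
git commit -q -m "squash of feature1 and feature2"

new_tree=$(git rev-parse 'HEAD^{tree}')
new_parents=$(git show -s --format=%P HEAD)
echo "new commit parent: $new_parents"
```

    HEAD~2 follows first-parent links, which is why it lands on the pre-merge tip of develop rather than somewhere on a feature branch.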

    Again, it's important that none of the real merges have been published or, if they have been published, that you convince everyone else who has picked them up to stop using them.
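    For the other option, two separate fake squashes, the answer only says that both original merges must be kept long enough to get their contents into the index. One possible way to do that, offered here as an assumption rather than as the answer's own method, is git read-tree, which loads a commit's contents into the index without touching the work-tree. The commit_file helper and file names are hypothetical.

```shell
#!/bin/sh
# Throwaway demo: replace two real merges with two squash-style commits.
set -eu
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo User"         # placeholder identity
git config user.email "demo@example.com"

commit_file() {                          # hypothetical helper: one new file, one commit
  echo "$1" > "$1.txt"; git add .; git commit -q -m "$1"
}

commit_file base
git checkout -q -b feature2; commit_file f2a; commit_file f2b
git checkout -q develop;     commit_file d1
git checkout -q -b feature1; commit_file f1a; commit_file f1b
git checkout -q develop;     commit_file d2
pre_merge=$(git rev-parse HEAD)

git merge -q --no-edit feature1
m1=$(git rev-parse HEAD)                 # remember the first real merge
git merge -q --no-edit feature2
m2=$(git rev-parse HEAD)                 # remember the second real merge

git reset -q --soft "$pre_merge"         # branch pointer back before both merges

git read-tree "$m1"                      # index := contents of the first merge
git commit -q -m "squash of feature1"
s1=$(git rev-parse HEAD)

git read-tree "$m2"                      # index := contents of the second merge
git commit -q -m "squash of feature2"
s2=$(git rev-parse HEAD)

leftover=$(git status --porcelain)       # work-tree already matches the second merge
echo "squash commits: $s1 $s2"
```

    The work-tree is never modified here: it keeps the contents of the second merge throughout, so it matches the final commit once both squash commits exist.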