Someone not familiar with git committed on his branch, and then made a merge commit with the develop branch. When merging, he:
Now I want to keep parts 1 and 2 but revert the 3rd one. What should I do? Note that his branch has already been pushed to the remote, so I hope that reset can be avoided.
What I have tried:
git revert <commit-id> -m 1
and got back to the commit before merging. What I was expecting here was the same result as git reset HEAD^; git merge develop, but it seems that I do not understand revert correctly.
There is no right answer to this particular problem. There are only answers that leave a few problems, and answers that leave many problems. The badness of each of these problems depends on your particular situation:
For instance, using git reset to strip the merge, followed by a git push --force, creates problems for anyone else using the remote clone. But perhaps only one other person is using that clone, and that one other person already knows what to do, or can be instructed as to what to do.
In this case, the "badness" of stripping the bad merge and starting over is relatively small, especially since you can keep the good resolutions (although this requires manual work and a lot of Git knowledge). Once you're done, nobody ever has to deal with the bad merge again, which leaves things in a nice state.
But perhaps many people are using that remote repository, and stripping out the bad merge would cause irreparable damage. In that case, the "badness" of stripping the bad merge is enormous, and you should use another strategy.
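If stripping the merge is acceptable in your situation, the sequence might look like the sketch below. Everything here is made up for illustration: the throwaway repositories under a temporary directory, the branch name topic, and the one-file commits. I also use --force-with-lease rather than plain --force; that is my substitution, not something the question asked for. It refuses to overwrite the remote branch if someone else has pushed to it since our last fetch.

```shell
#!/bin/sh
# Sketch only: throwaway repos, hypothetical branch name "topic".
set -eu
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"
git init -q "$tmp/work"
cd "$tmp/work"
git config user.name demo
git config user.email demo@example.com

git checkout -q -b develop
echo base > file.txt;     git add file.txt;   git commit -q -m H
git checkout -q -b topic
echo ours > ours.txt;     git add ours.txt;   git commit -q -m J
git checkout -q develop
echo theirs > theirs.txt; git add theirs.txt; git commit -q -m L
git checkout -q topic
git merge -q --no-edit develop        # stands in for the bad merge M
git remote add origin "$tmp/remote.git"
git push -q -u origin topic           # M is now published

# Strip the merge locally, then overwrite the published branch.
git reset -q --hard HEAD^             # HEAD^ is M's first parent, i.e. J
git push -q --force-with-lease origin topic
```

Anyone who already fetched M still has it in their own clone and needs to reset or rebase their work, which is exactly the "badness" being weighed here.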
The main thing to remember is that a Git repository is, in the end, nothing more or less than a collection of commits. The commits in the repository are the history and are the repository.1 So, whatever you end up doing, you will add commits to the repository. To fix a bad merge commit, you must add more commits.
These need not be merge commits. You can leave the existing merge in place, and simply remember it (or mark it; see git notes) as "bad, do not use". You can then add ordinary (non-merge) commits that fix the problem.
Each commit stores a full snapshot of every file. Commits do not contain differences from a previous commit. So a bad merge commit is simply a commit with some files having the wrong contents. A subsequent non-merge commit can store files with the right contents.
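To see the snapshot idea for yourself, here is a small sketch in a scratch repository. The file names a.txt and b.txt are just placeholders. The second commit changes only a.txt, yet its tree still lists both files, and the unchanged b.txt is the very same blob object in both commits.

```shell
#!/bin/sh
# Sketch only: scratch repo with two placeholder files.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

echo one > a.txt
echo two > b.txt
git add a.txt b.txt
git commit -q -m first

echo changed > a.txt
git add a.txt
git commit -q -m second              # only a.txt differs from "first"

git cat-file -p HEAD                 # header lines: tree, parent, author, ...
git ls-tree --name-only HEAD         # the snapshot still lists a.txt and b.txt
git rev-parse HEAD:b.txt HEAD^:b.txt # same blob hash printed twice
```

Git shows you differences by comparing two snapshots when you ask; it does not store them in the commit.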
Your problem thus boils down to two parts:
You must decide whether or not to remove the bad merge. This is a value judgment, with no right answer.
You must come up with the corrected contents. This is a mechanical problem: how will you produce the correct files? Here, Git can help.
Let me get a footnote out of the way, and then describe how Git can help.
1This is a mild overstatement: there may be git notes, although technically those are stored in commits anyway, and tags; and humans attach significance to branch names, which are also in the repository, but are rather ephemeral and should not be depended on quite so heavily.
A true merge, in Git, is an operation on three input commits.2 The three commits include your current commit, as selected by your current branch name and the special name HEAD. You give Git another commit on the command line: when you run git merge other-branch-name or git merge hash-id, Git uses this to locate the other branch tip commit. For much more on how branch tips work, and how HEAD works, see Think Like (a) Git. This site will also help understand the next part.
Given these two branch tip commits, Git now finds the third—or in some sense, first—of the three input commits on its own, using the commit graph. Each ordinary, non-merge commit connects, backwards, to some earlier commit. This series of backwards connections must eventually arrive at some common starting point, where the two branches last shared some particular commit.
We can draw this situation like this:
          I--J   <-- our-branch (HEAD)
         /
...--G--H
         \
          K--L   <-- their-branch
Our latest commit, which I've drawn as commit J, points backwards to some earlier commit(s), which I've drawn as commit I. Their latest commit L points backwards to some earlier commit K. But then I and K point backwards to some commit (here, H) that's on both branches at the same time. Think Like (a) Git has a lot more about how this works, but for our purposes here, we need only see that Git finds commit H on its own, and that it's on both branches.
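You can ask Git which commit it would pick with git merge-base. The sketch below rebuilds the graph drawn above in a scratch repository; the commit() helper and the one-file-per-commit layout are just conveniences I made up so that each commit has something in it.

```shell
#!/bin/sh
# Sketch only: rebuild the G-H / I-J / K-L graph in a scratch repo.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

# Hypothetical helper: one new file per commit, named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git checkout -q -b our-branch
commit G; commit H
git checkout -q -b their-branch      # branches off at H
commit K; commit L
git checkout -q our-branch
commit I; commit J

base=$(git merge-base our-branch their-branch)
git log -1 --format='merge base: %h %s' "$base"
git log --graph --oneline --all      # the same picture, drawn by Git
```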
When we run git merge with commit J as our commit (which Git calls --ours or HEAD or the local commit) and commit L as their commit (which Git typically calls either --theirs or the remote commit), Git finds commit H as the merge base. Then it:
Compares the snapshot in commit H to the snapshot in our commit J. This finds out what files we changed, and what changes we made to those files.
Compares the snapshot in H to the one in L. This finds out what files they changed, and what changes they made to those files.
Combines the changes. This is the hard-work part. Git does this combining using simple text-substitution rules: it has no idea which changes really should be used. Where the rules allow, Git makes these changes on its own; where the rules claim that there is a conflict, Git passes the conflict on to us, for us to fix. In any case, Git applies the combined changes to the snapshot in the starting commit: merge base H. That keeps our changes while adding theirs.
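The combining step can be tried on a single file with git merge-file, which does a three-way merge of plain files. This is only a sketch of the idea: a real git merge also handles renames, added and deleted files, and so on. The file names below are placeholders standing in for the versions of one file in H, J and L.

```shell
#!/bin/sh
# Sketch only: three-way merge of one file, outside any repository.
set -eu
dir=$(mktemp -d)
cd "$dir"
printf '%s\n' one two three four five > base.txt    # version in merge base H
printf '%s\n' ONE two three four five > ours.txt    # our change (J): line 1
printf '%s\n' one two three four FIVE > theirs.txt  # their change (L): line 5

# Changes touch different regions, so they combine without help.
git merge-file -p ours.txt base.txt theirs.txt > merged.txt
cat merged.txt

# Both sides change line 1 differently: the rules report a conflict.
printf '%s\n' uno two three four five > theirs2.txt
git merge-file -p ours.txt base.txt theirs2.txt > conflicted.txt \
  || echo "conflicts: $?"
cat conflicted.txt                                  # contains <<<<<<< markers
```

The second case is where Git hands the problem to a human, and where the bad resolution in the question came from.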
So, if the merge goes well on its own, Git will make a new merge commit M, like so:
          I--J
         /    \
...--G--H      M   <-- our-branch (HEAD)
         \    /
          K--L   <-- their-branch
New commit M has a snapshot, like any commit, and a log message and author and so on just like any commit. The only thing that's special about M is that it links back not just to commit J (our commit when we started) but also to commit L, the commit whose hash ID we told git merge about (either using the raw hash ID, or using the name their-branch).
If we have to fix up the merge ourselves, we do that and run git add and then either git commit or git merge --continue, to make merge commit M. When we do this, we have full control over what goes into M.
2This is the kind of merge that results in a merge commit, i.e., a commit with two parents. Git can also perform what it calls a fast-forward merge, which is not a merge at all and produces no new commit, or what it calls an octopus merge, which takes more than three input commits. Octopus merges have certain restrictions, which means they do not apply to this case. True merges can involve making a recursive merge, which complicates the picture as well, but I'm going to ignore this case here: the complications are not directly relevant to what we'll be doing.
Our situation here is that we started with:
          I--J   <-- our-branch (HEAD)
         /
...--G--H
         \
          K--L   <-- their-branch
Then someone (presumably not us 😀) ran git merge their-branch or equivalent, got merge conflicts, and resolved them incorrectly and committed:
          I--J
         /    \
...--G--H      M   <-- our-branch (HEAD)
         \    /
          K--L   <-- their-branch
To re-perform the merge, we just need to check out / switch to commit J:
git checkout -b repair <hash-of-J>
for instance, or:
git switch -c repair <hash-of-J>
to use the new (since Git 2.23) git switch command. Then we run:
git merge <hash-of-L>
To get the two hash IDs, we can use git rev-parse on merge commit M, with the funky ^1 and ^2 syntax suffixes; or we can run git log --graph or similar and find the two commits and see their hash IDs directly. Or, if the name their-branch still finds commit L, we can run git merge their-branch. Git just needs to locate the correct commit.
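Here is what the ^1 and ^2 suffixes look like in practice. The scratch repository and the commit() helper are made up for the demonstration; in your own repository you would replace HEAD with the hash ID of the bad merge.

```shell
#!/bin/sh
# Sketch only: find both parents of a merge commit with rev-parse.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

# Hypothetical helper: one new file per commit, named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git checkout -q -b our-branch;   commit H
git checkout -q -b their-branch; commit L
git checkout -q our-branch;      commit J
git merge -q --no-edit their-branch     # merge commit M

M=$(git rev-parse HEAD)
J=$(git rev-parse "$M^1")   # first parent: the commit we were on when merging
L=$(git rev-parse "$M^2")   # second parent: the commit named on the command line
git log -1 --format='first parent:  %h %s' "$J"
git log -1 --format='second parent: %h %s' "$L"
```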
Git will, at this point, repeat the merge attempt it tried earlier, following exactly the same rules. This will produce exactly the same conflicts. Our job is now to fix up these conflicts, but this time, we do it correctly.
If we like the resolution that someone else made in commit M, we can ask git checkout (all versions of Git) or git restore (Git 2.23 and later) to extract the resolved file that the other person put in commit M:
git checkout <hash-of-M> -- <path/to/file>
for instance. Even if we don't like the entire resolution, we can still do that and then fix up the file and run git add; only if we don't like any of the resolution do we have to redo the whole fix-up by hand.
One way or another, though, we just fix up each file and git add the result to tell Git that we have fixed up the file. (The git checkout hash -- path trick makes it so we can skip the git add step in some cases, but it won't hurt to run git add anyway either.) When we're all done, we run git merge --continue or git commit to finish this merge: the result is a new merge commit, drawn here as M2, on our new branch repair or whatever we called it when we created it:
          I--J-----M2   <-- repair (HEAD)
         /    \   /
...--G--H      M /  <-- our-branch
         \    /_/
          K--L   <-- their-branch
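Putting the re-merge together, a run might go like this sketch. The two files good.txt and bad.txt, their contents, and the put() helper are all invented: good.txt stands for a conflict the other person resolved well in M, bad.txt for one they got wrong. I finish each merge with git commit -m so that no editor opens in a script; git merge --continue does the same job interactively.

```shell
#!/bin/sh
# Sketch only: redo a badly resolved merge on a new branch "repair".
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

# Hypothetical helper: write the same text to both files and stage them.
put() { echo "$1" > good.txt; echo "$1" > bad.txt; git add good.txt bad.txt; }

git checkout -q -b our-branch;   put base;   git commit -q -m H
git checkout -q -b their-branch; put theirs; git commit -q -m L
git checkout -q our-branch;      put ours;   git commit -q -m J

# Someone merges: good.txt is resolved well, bad.txt is not. This is M.
git merge their-branch >/dev/null || true   # stops: both files conflict
echo "ours+theirs" > good.txt
echo "oops" > bad.txt
git add good.txt bad.txt
git commit -q -m "M: bad merge"
M=$(git rev-parse HEAD)

# Re-do the merge, starting from J, on a new branch.
git checkout -q -b repair "$M^1"
git merge "$M^2" >/dev/null || true         # the same conflicts come back
git checkout "$M" -- good.txt               # reuse the resolution we like
echo "ours+theirs" > bad.txt                # resolve the other one by hand
git add bad.txt
git commit -q -m "M2: corrected merge"
git log --graph --oneline --all
```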
We can now git checkout our-branch, which lands us on commit M, and grab files directly from repair:
git checkout our-branch
git checkout repair -- path/to/file1
git checkout repair -- path/to/file2
...
and then we're ready to git commit to make a new commit N. Or, we can en-masse grab every file from M2:
git checkout repair -- .
and run git status, git diff --cached, and/or git commit at this point, depending on how sure we are we got this all right.
The result of the above is:
          I--J-----M2   <-- repair
         /    \   /
...--G--H      M-/--N   <-- our-branch (HEAD)
         \    /_/
          K--L   <-- their-branch
and we can now delete branch name repair entirely: commit N is just "magically fixed".
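As a sketch of this last step, the script below builds a tiny bad merge M and a corrected merge M2 in a scratch repository, then copies M2's files onto our-branch as an ordinary commit N. The single file file.txt and its contents are placeholders.

```shell
#!/bin/sh
# Sketch only: carry the corrected snapshot onto our-branch as commit N.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name demo
git config user.email demo@example.com

git checkout -q -b our-branch
echo base > file.txt;   git add file.txt; git commit -q -m H
git checkout -q -b their-branch
echo theirs > file.txt; git add file.txt; git commit -q -m L
git checkout -q our-branch
echo ours > file.txt;   git add file.txt; git commit -q -m J
git merge their-branch >/dev/null || true          # conflict
echo oops > file.txt;   git add file.txt; git commit -q -m "M: bad merge"
git checkout -q -b repair HEAD^1                   # back to J
git merge their-branch >/dev/null || true          # same conflict again
echo "ours+theirs" > file.txt
git add file.txt;       git commit -q -m "M2: corrected merge"
M2=$(git rev-parse repair)

# The step described above: grab M2's files while sitting on M.
git checkout -q our-branch
git checkout repair -- .
git diff --cached --stat                           # what N will change
git commit -q -m "N: fix up bad merge M"
git branch -D repair                               # the name is no longer needed
```

One caveat, as far as I know: git checkout repair -- . only adds and overwrites files, so a file that exists in M but was removed in M2 stays in place. git restore --source=repair --staged --worktree -- . (Git 2.23 and later) removes such files as well, since it defaults to no-overlay mode.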
If we intend to keep commit M2, we can use git merge to merge repair into M. We might want to run git merge --no-commit so that we gain full control: this will stop git merge from making the actual commit yet, so that we can inspect the snapshot that's about to go into the new merge. Then the final git merge --continue or git commit makes N as a new merge commit:
          I--J-----M2   <-- repair
         /    \   /  \
...--G--H      M-/----N   <-- our-branch (HEAD)
         \    /_/
          K--L   <-- their-branch
and once again we can delete the name repair; it no longer adds anything of value.
(I'd generally just make a simple non-merge fixup commit myself, rather than another merge. The merge base for making N as a merge is both commits J and L, which means Git will do a recursive merge unless we specify -s resolve. Recursive merges tend to be messy and have weird conflicts sometimes.)
Commits that occur after bad-merge-M just need their changes carried forward into what I have drawn above as final commit N. How you go about achieving that is not really terribly important, though some ways may have Git do more of the work for you. The thing to remember here is what I said earlier: in the end, it's the commits in the repository that matter. That includes both the graph (the backwards-looking connections from commit to earlier commit) and the snapshots. The graph matters to Git itself, as it is how git log works and how git merge finds the merge base. The snapshots matter to you, as they are how Git stores the content that you care about.