Empty commits removed after interactive rebase, even though --keep-empty is used


I'm having trouble with the --keep-empty option of git rebase, and I'm not sure whether I'm misunderstanding what this option does or whether there's a bug.

Here is a minimal example:

Setup

  1. Create a new Git repository and an initial, unrelated commit.

    $ git init
    $ echo something >base.txt
    $ git add base.txt
    $ git commit -m 'some base commit to not run into the root corner case'
    
  2. Create a new commit which adds two new files.

    $ echo A >a.txt; echo B >b.txt
    $ git add a.txt b.txt
    $ git commit -m 'add A and B'
    
  3. Modify one of the files.

    $ echo A1 >a.txt
    $ git add a.txt
    $ git commit -m 'change A'
    
  4. Modify the other file.

    $ echo B1 >b.txt
    $ git add b.txt
    $ git commit -m 'change B'
    

Rebase

$ git checkout -b rebased master
$ git rebase --keep-empty -i :/base

… choosing to edit the commit where A and B are added, and changing it so that only B is added (in a real scenario the reason might be that A is confidential):

$ git rm a.txt
$ git commit --amend
$ git rebase --continue

Naturally, the next commit where A is modified now gives a conflict:

error: could not apply 182aaa1... change A

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".
Could not apply 182aaa1701ad100fc02a5d5500cacebdd317a24b... change A

… choosing to not add the modified version of a.txt:

$ git mergetool
Merging:
a.txt

Deleted merge conflict for 'a.txt':
  {local}: deleted
  {remote}: modified file
Use (m)odified or (d)eleted file, or (a)bort? d

The commit where A was modified is now empty:

$ git diff --cached
# nothing

… and finishing the rebase:

$ git rebase --continue
Successfully rebased and updated refs/heads/rebased.

Question

So now I have two versions of my history, with the difference that there is no trace of A in one of them. However, because I chose the --keep-empty option, I still expect an empty commit to exist in rebased, which would show me that A would have been modified, had it been there.

But apparently, this is not the case:

$ git log --oneline master
f893569 change B
182aaa1 change A
3340b71 add A and B
38cb5da some base commit to not run into the root corner case

$ git log --oneline rebased
73a2c05 change B
55f502b add A and B
38cb5da some base commit to not run into the root corner case

Is this not what --keep-empty is supposed to do, or does it not work correctly?


Related: "Rebase on the root and keep empty commits" is a very similar question, but it involves the --root corner case, which I explicitly avoided here. It also has no answer, only some comments suggesting that what I'm showing here should work. Another difference is that in the other question the commit is empty in the first place, while here it only becomes empty after resolving a conflict.


Solution

  • It's sort of a bug, due to something that is sort of a feature. :-)

    When you run an interactive rebase and it "pauses", the rebase process in reality exits, but it leaves some state files behind so that a later git rebase can tell that it is really a continuation. This is fine as far as it goes: you later run git rebase --continue, which starts a new rebase process and tells it: "You're not really new; go read the saved state and act as if you're continuing the original rebase."

    Now consider what an "interactive rebase" is. In reality it is mostly a series of cherry-pick operations: the pick command literally instructs the old rebase shell script (which is being phased out now) to run git cherry-pick.

    OK, no big deal so far. But let's consider why an interactive rebase stops. There are two reasons:

    1. You marked a commit "edit". Git actually commits the cherry-pick, then stops to let you amend the commit or otherwise fuss with it.

    2. There was a problem, such as a merge conflict, that forced the stop.

    In case (1), when you run git rebase --continue, Git should not make its own commit.

    In case (2), when you run git rebase --continue, Git should make its own commit. That is, it should unless you make your own commit first (this is the feature part), in which case Git should not make another one.

    Git could, and perhaps should, record the reason-for-stoppage so as to tell these two cases apart ... but it doesn't. Instead, it just looks at the state on --continue.

    For a non-interactive rebase, Git knows that it only stops on conflicts, so it knows to try to make a commit, and to complain if there is nothing to commit. This is where the --keep-empty or -k flag is useful. (Internally, the non-interactive case uses git format-patch and git am by default, although you can force it to use the interactive machinery, with --preserve-merges for instance. I mention this because it is an implementation reason why Git has to know whether you're being "interactive": as so often happens, Git lets the implementation dictate the behavior. If Git didn't need this distinction, --continue could use the same code for interactive and non-interactive rebase; but it does need it, and hence doesn't use the same code.)

    For an interactive rebase, though, Git allows you to make your own commit in case (2), just before running git rebase --continue (this is the feature part). If you do, the --continue step should just move on to the next commit. So --continue only checks whether there is something to commit right now, rather than whether the earlier interactive rebase stopped for case (1) or case (2). This simple implementation trick enables the feature, but it also means that --keep-empty cannot work here: a conflict resolved down to no changes and a commit you already made yourself both look like "nothing to commit", and Git can't tell them apart.

    The workaround is to run your own git commit --allow-empty after resolving the conflict and before running git rebase --continue. In other words, convert case (2) into a simulated case (1), using the "you may make your own commit" feature.
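
The workaround can be sketched end to end as a script. This is only an illustration under some assumptions that are not part of the question: it works in a throwaway directory from mktemp, sets a placeholder identity, shortens the base commit message, uses HEAD~3 in place of :/base, and drives the todo list with GIT_SEQUENCE_EDITOR and sed (plus GIT_EDITOR=true) so that no editor opens. Exact messages and exit codes at each stop may differ between Git versions.

```shell
#!/bin/sh
# Sketch: rebuild the question's history, then keep the emptied commit by
# committing it by hand with --allow-empty before continuing the rebase.
set -e
export GIT_EDITOR=true                  # assumption: never open an editor here

repo=$(mktemp -d)                       # throwaway repository for the demo
cd "$repo"
git init -q
git config user.name 'Demo User'        # placeholder identity
git config user.email 'demo@example.com'

echo something >base.txt; git add base.txt; git commit -q -m 'base'
echo A >a.txt; echo B >b.txt; git add a.txt b.txt; git commit -q -m 'add A and B'
echo A1 >a.txt; git add a.txt; git commit -q -m 'change A'
echo B1 >b.txt; git add b.txt; git commit -q -m 'change B'

# Turn the first todo line ("add A and B") into "edit" without an editor.
# The rebase stops here; a non-zero exit status at the stop is tolerated.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" \
    git rebase -q -i --keep-empty HEAD~3 || true

# Case (1) stop: drop a.txt from the commit being edited.
git rm -q a.txt
git commit -q --amend --no-edit

# Continuing replays 'change A', which stops on a modify/delete conflict: case (2).
git rebase --continue >/dev/null 2>&1 || true

# Resolve by keeping the deletion; nothing is left staged afterwards.
git rm -q a.txt

# The workaround: make the (empty) commit yourself, then continue.
git commit -q --allow-empty -m 'change A'
git rebase --continue >/dev/null 2>&1

# List the subjects of the rebased history.
git log --format=%s
```

If the workaround behaves as described in the answer, the final log should still contain a "change A" entry, now as an empty commit, between "add A and B" and "change B".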