
How to rebase only specific commits?


When I do git rebase -i <commit>, it pulls up a list of commits starting from that commit, and I have to choose which commits I want to edit by changing pick to e. But when I change pick to e and close the editor, Git still iterates through all the commits, not just the ones I want to edit.

For example, I do git rebase -i --root and only choose to edit the latest commit made. Git still tries to iterate through the entire list of commits. After I close the editor it says Rebasing (1/593), where 593 is the number of commits in the list. It goes through all 593 of them and only stops to let me edit the ones I chose.

Is there a way to target only a specific commit, even if it's in between a lot of other commits, without having rebase go through the entire list?


Solution

  • The way to understand this is to understand what Git is. Git follows some extremely basic and important rules. Here are two of them:

    1. No commit can ever be changed.

    2. The parent pointer of a commit is part of that commit (and therefore, in accordance with rule 1, cannot be changed).
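Here is a minimal sketch of why those two rules go together. It is a toy model, not Git's real object format: the `commit_id` helper and the `files-v1` / `files-v2` labels are made up for illustration, and real Git also hashes the author, committer, and timestamps. The point is only that a commit's ID is computed from everything in the commit, including the parent's ID.

```python
import hashlib


def commit_id(message, tree, parent):
    """Toy stand-in for a commit ID: a hash over everything in the commit."""
    # Assumption: simplified payload. Real Git includes more fields.
    payload = f"tree {tree}\nparent {parent}\n\n{message}\n"
    return hashlib.sha1(payload.encode()).hexdigest()


a = commit_id("a", tree="files-v1", parent=None)
b = commit_id("b", tree="files-v2", parent=a)

# "Changing" the message of a really means creating a different commit.
a_new = commit_id("a with a different commit message", tree="files-v1", parent=None)

# Same message and same files as b, but a different parent: a different commit.
b_new = commit_id("b", tree="files-v2", parent=a_new)

print(a != a_new)  # True
print(b != b_new)  # True
```

Nothing about b's message or files changed, yet giving it a different parent produces a different ID.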

    So now consider this situation:

    A -- B -- C -- D (mybranch)
    

    where time flows from left to right: A was created first, and is the parent of B, which is the parent of C, which is the parent of D.

    Now let's say I want to change the commit message of A. Well, I can't! You can't change anything about a commit. But what you can do is replace A with a different commit: one that has a different commit message (but contains the same files).

    But if you do that, you have this:

    A -- B -- C -- D (mybranch)
    
    A'
    

    where A' is the new commit that is like A, but differs in its commit message. This is no good, because it isn't what we wanted. We want the history to stay "the same", with A' as the parent of B in the history. Well, we can't do that! You can't change the parentage of B. But what you can do is replace B with a new commit that looks like B but has A' as its parent:

    A -- B -- C -- D (mybranch)
    
    A' -- B'
    

    But wait, there's more! We have to keep doing that, for all the subsequent commits. And when we get done, Git simply moves the mybranch pointer to the new history:

    A -- B -- C -- D
    
    A' -- B' -- C' -- D' (mybranch)
    

    That is what happens when you use interactive rebase to edit the commit message of A. You get all new commits for A and all the commits after it.

    And that is exactly what you are seeing and asking about.
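The same chain reaction can be sketched in a few lines. This is a toy model under stated assumptions, not how Git is implemented: `commit_id`, `build_history`, and `reword` are hypothetical helpers, and a commit here is just a message plus a parent ID. It shows that rewording a commit replaces that commit and every commit after it, while commits before it keep their IDs.

```python
import hashlib


def commit_id(message, parent):
    """Toy commit ID: a hash over the message and the parent's ID."""
    payload = f"parent {parent}\n\n{message}\n"
    return hashlib.sha1(payload.encode()).hexdigest()


def build_history(messages):
    """Return the commit IDs of a linear history, oldest first."""
    ids, parent = [], None
    for message in messages:
        parent = commit_id(message, parent)
        ids.append(parent)
    return ids


def reword(messages, index, new_message):
    """Rebuild the history with one message replaced."""
    edited = list(messages)
    edited[index] = new_message
    return build_history(edited)


messages = ["a", "b", "c", "d"]
old = build_history(messages)

# Reword the root commit: every ID changes.
new_root = reword(messages, 0, "a with a different commit message")
changed_root = [o != n for o, n in zip(old, new_root)]
print(changed_root)  # [True, True, True, True]

# Reword the third commit: only it and its descendant change.
new_c = reword(messages, 2, "c, reworded")
changed_c = [o != n for o, n in zip(old, new_c)]
print(changed_c)  # [False, False, True, True]
```

So the cost of editing a commit depends on how many commits come after it, not on how many commits you marked in the todo list.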


    To see that this is true, try it on an example repo, and watch the SHA numbers. Here's an example. I use git log to see what I have at the start:

    * 0c32c25 (HEAD -> main) d
    * 12ec6ca c
    * 28b1e17 b
    * b8cb561 a
    

    Now I interactive rebase down to the root; here's how I edit the todo list:

    r b8cb561 a
    pick 28b1e17 b
    pick 12ec6ca c
    pick 0c32c25 d
    

    I reword the first commit so that its message is "a with a different commit message". Now the rebase finishes, and this is what I get:

    * 0931c89 (HEAD -> main) d
    * 45ebf0a c
    * 084b06c b
    * bcc0bed a with a different commit message
    

    Look at the SHA numbers. They have all changed. That's because these are not the same commits I started with! They have all been replaced.


    One final observation. If you're watching carefully, you should be saying: In the diagram, what happens to the original A through D that have been "replaced"?

    A -- B -- C -- D
    
    A' -- B' -- C' -- D' (mybranch)
    

    It looks like they are still there. Yes, you're right! They are still there. After your big rebase, all your commits are duplicated. The new versions exist, and so do the old versions. This is actually a really cool feature of Git: commits are not erased when they are replaced.

    In this particular situation, you have no easy way to access the original A through D. But you could if you wanted to. For example, you could use the reflog. Or you could have put another branch name on D before doing the rebase.

    Eventually, the repo will be "garbage collected". Once nothing points to D anymore (no branch name, no tag, and no unexpired reflog entry), D and therefore C, B, and A are considered "unreachable", and Git will delete them. But it could be weeks before that happens. You have lots of time to recover the originals if that's what you want to do. Cool, eh?
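To make "unreachable" concrete, here is a rough sketch of the idea, again as a toy model rather than Git's actual garbage collector. The `reachable` helper and the `backup` branch name are hypothetical, and the model deliberately ignores the reflog, which in real Git also keeps old commits alive for a while.

```python
def reachable(refs, parents):
    """Collect every commit reachable by walking parent links from any ref."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit is None or commit in seen:
            continue
        seen.add(commit)
        stack.append(parents[commit])
    return seen


# Each commit maps to its parent; both histories exist after the rebase.
parents = {
    "A": None, "B": "A", "C": "B", "D": "C",
    "A'": None, "B'": "A'", "C'": "B'", "D'": "C'",
}

# After the rebase, the only branch name points to D'.
after_rebase = {"mybranch": "D'"}
unreachable = sorted(set(parents) - reachable(after_rebase, parents))
print(unreachable)  # ['A', 'B', 'C', 'D']

# With a second branch name left on D, the originals stay reachable.
with_backup = {"mybranch": "D'", "backup": "D"}
unreachable_with_backup = sorted(set(parents) - reachable(with_backup, parents))
print(unreachable_with_backup)  # []
```

Putting a name on D before the rebase is enough to keep the whole original chain, because each commit holds a pointer to its parent.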