Tags: git, git-cherry-pick

Git: copy a commit to another branch when I am in the original branch


To move a commit from one branch to another I need to be on the target branch.

The question is whether it is possible to copy a commit when I am in the original branch.


Solution

  • TL;DR

    Consider git worktree. Jump to the end to see how, but read through the middle to see why.

    Long

    To move a commit from one branch to another I need to be on the target branch.

    This is not the case. This claim results from a misunderstanding of how Git commits and branch-names work. But it's an understandable mistake, because the way Git commits do work is pretty confusing, and the end-user goal is usually not to move a commit at all, but rather to copy the effect of a commit into a new commit.

    These terms are all a bit clumsy:

    The question is whether it is possible to copy a commit when I am in the original branch.

    Yes—but there's a significant problem here, depending on precisely what you mean by copy a commit. Your tags mention git cherry-pick, which turns a commit into a set of changes, then applies that same set of changes to some other commit. This is one definition of "copying" a commit, and is probably the one you mean, and it's the one that you can't do when "in the original branch" (this phrase is also a bit clumsy: you're never really in a branch, although you can be on a branch).

    Here's what's going on:

    • Commits, in Git, are real things, with permanent existence—well, mostly permanent—that you can transfer from one Git repository to another. Each commit is numbered, with a unique but random-looking hash ID. Each commit holds a full snapshot of every file, plus some metadata: information about who made the commit, when, and why (their log message), for instance.

    • Each commit stores, in its metadata, the hash ID(s) of some earlier commit(s). Most commits store exactly one such raw hash ID, which we call the parent of the commit.

    • Branch names, in Git, are temporary and ephemeral. They come and go as you please. They're not real, in a sense: they act a lot like little yellow sticky notes.

    You simply paste as many sticky-notes as you like on one commit, writing a different "branch name" on each note. That commit is now the last commit on all of those branches. Peel the sticky-note off that commit, paste it on some other commit, and now your newly-chosen (existing) commit is the last commit on that branch.

    You pick one of these sticky notes as your "current branch" by pasting a different-colored note onto the yellow-sticky-note with the branch name on it. (Let's use green for HEAD, for instance.)
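    The sticky-note picture can be checked directly. What follows is a minimal sketch, assuming a throwaway repository in a temporary directory; the branch names feature-a and feature-b and the identity settings are made up for illustration.

```shell
set -e
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # call the first branch "main"
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"

# Paste two more yellow sticky notes onto the same commit.
git branch feature-a
git branch feature-b

tip_main=$(git rev-parse main)
tip_a=$(git rev-parse feature-a)
tip_b=$(git rev-parse feature-b)
echo "main:      $tip_main"
echo "feature-a: $tip_a"
echo "feature-b: $tip_b"

# HEAD (the green note) is attached to exactly one branch name.
current=$(git symbolic-ref --short HEAD)
echo "HEAD is attached to: $current"
```

    All three names print the same hash ID, because all three notes are stuck on the same commit.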

    Whenever you make a new commit—with git commit, or by concluding a cherry-pick or merge or whatever—Git will place that new commit at the tip of the current branch. It does that by:

    • looking to see which branch name has HEAD pasted onto it (attached to it)
    • stuffing the new commit's new hash ID (unique to this new commit) into the appropriate branch name—or, in our analogy, peeling the appropriate yellow-sticky-note off the old branch tip and pasting it onto the new commit.

    The new commit's parent commit is the old branch-tip commit. That is, suppose we have this to start with:

    ... <-F <-G <-H   <-- somebranch (HEAD)
    

    The name somebranch—our "yellow sticky note" with somebranch written on it—says that the last commit of this chain of commits is commit H.

    Commit H has, inside its metadata, the raw hash ID of earlier commit G. So Git can use the contents of commit H to locate commit G.

    Commit G is of course also a commit, so it has metadata, including the raw hash ID of some still-earlier commit F. Git can therefore use G to find F. Having found F, Git will be able to hop back one more step in history, and so on down throughout all the history from here on backwards.

    Git cannot move forwards this way. Git can only step backwards. To find commit H—the last commit in the chain—Git needs something with H's hash ID saved. That something is our branch name.
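    The backwards-pointing links can be inspected with ordinary Git commands. This is a sketch under the assumption of a fresh temporary repository with three empty commits named F, G, and H; the identity settings are placeholders.

```shell
set -e
# Throwaway repository with the chain F <- G <- H (hypothetical setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example User"
git config user.email "user@example.com"
for name in F G H; do
    git commit -q --allow-empty -m "$name"
done

h=$(git rev-parse somebranch)   # the branch name gives us H
g=$(git rev-parse "$h^")        # H's metadata gives us G
f=$(git rev-parse "$g^")        # G's metadata gives us F

# The raw commit object shows the stored "parent" line.
git cat-file -p "$h"

parent_line=$(git cat-file -p "$h" | grep '^parent ')
count=$(git rev-list --count somebranch)
echo "commits reachable from somebranch: $count"
```

    Nothing in F or G records H, which is why the name somebranch is needed to find the end of the chain.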

    But now we make a new commit. This new commit needs a parent: Git sets its parent to the current commit's hash ID, H. So new commit I points backwards to existing commit H:

    ... <-F <-G <-H <-I
    

    Having done that, Git now writes I's hash ID into the name somebranch, updating the yellow sticky note:

    ... <-F <-G <-H <-I   <-- somebranch (HEAD)
    

    The green sticky note for HEAD is still attached to the same old yellow sticky note; it's just that the number on the yellow sticky note labeled somebranch is now the actual, raw hash ID of commit I.
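    To watch the name move, record the branch tip before and after a commit. A minimal sketch, assuming a temporary repository whose only branch is called somebranch (identity settings are placeholders):

```shell
set -e
# Throwaway repository; names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"

old_tip=$(git rev-parse somebranch)      # hash ID of H

git commit -q --allow-empty -m "I"       # make new commit I

new_tip=$(git rev-parse somebranch)      # the name now holds I's hash ID
parent_of_new=$(git rev-parse "$new_tip^")
attached=$(git symbolic-ref --short HEAD)

echo "old tip:        $old_tip"
echo "new tip:        $new_tip"
echo "parent of new:  $parent_of_new"
echo "HEAD attached:  $attached"
```

    The parent of the new tip is the old tip, and HEAD is still attached to somebranch.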

    This is why we can move commits from one branch to another

    Suppose we have the following commits:

                 I--J   <-- dev
                /
    ...--F--G--H   <-- master
    

    If we simply have Git slide the name master forwards (the direction Git can't go on its own, so we have to give it the name dev so that it can find commit J), we get:

    ...--F--G--H--I--J   <-- dev, master
    

    Commits I and J are now on both branches, even though earlier, they were only on the dev branch. And now we can delete the dev name entirely, if we like: we had it so that we could find commit J, while the name master found commit H. If we delete dev, all commits are now only on master.

    In fact, the commits themselves did not move at all. What moved were the branch names—and then we deleted one. The branch names don't really matter, with the one important exception: they let us find the last commit. If we have no name at all, we can't find the commit at all. There are other kinds of names—tag names, for instance, and remote-tracking names, and so on—but we need some name for the "last" commit in any sequence, so that we can easily name that commit.

    Then, from there, we can have git log or some other Git operation work backwards. That finds some earlier commit. If that earlier commit seems particularly important (or poignant or exciting or whatever), we can attach a name directly to it, to make it easy to locate. The raw hash ID is how Git really finds it, but the names give us mere humans things that we can deal with: random-looking hash IDs are just too hard to get right.
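    One way to slide a name forwards without checking it out is git branch -f, sketched below in a temporary repository with empty commits F through J. This is only one option (git merge --ff-only dev while on master is another), and note that git branch -f does not itself check that the move is a fast-forward; here it is, because H is an ancestor of J.

```shell
set -e
# Throwaway repository; commit names mirror the drawing above.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
for name in F G H; do
    git commit -q --allow-empty -m "$name"
done
git checkout -q -b dev
for name in I J; do
    git commit -q --allow-empty -m "$name"
done
j=$(git rev-parse dev)

# master is not the current branch, so its name can be repointed directly.
git branch -f master dev
master_tip=$(git rev-parse master)

# Git will not delete the branch we are on, so move HEAD to master first.
git checkout -q master
git branch -q -d dev

remaining=$(git for-each-ref --format='%(refname:short)' refs/heads)
count=$(git rev-list --count master)
echo "branches left: $remaining"
echo "commits on master: $count"
```

    No commit was rewritten along the way: I and J keep their original hash IDs, and only the names changed.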

    Cherry-picking

    As I mentioned earlier, what git cherry-pick does is "copy" a commit. Now, a commit holds a full snapshot, not a set of changes. But a commit also holds a parent hash ID. That parent commit has a snapshot, too. Suppose we have some chain of commits:

    ...--G--H   <-- main
             \
              I--J--K   <-- branch
    

    and we've just fixed some terrible bug in commit K and want to import the same change directly to main, right away. Well, K has a snapshot, but its snapshot is built from J's snapshot, with some change to it.

    But Git can easily show us what's different in K, compared to what's in J. That's what git show hash or git log -p would show us for the commit whose hash ID is K. (In fact, we can just run git show branch, as Git will turn the name branch into the hash ID for us.) Git simply extracts both snapshots—the one for J, and the one for K—into a temporary area in memory, and then compares the two. For each file that's the same, it says nothing, and for each file that's different, it shows us what changed.
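    A small sketch of that comparison, assuming a temporary repository in which commit J holds two files and commit K changes only one of them. The file names readme.txt and app.txt are invented for the example.

```shell
set -e
# Throwaway repository; file and branch names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch
git config user.name "Example User"
git config user.email "user@example.com"

echo "docs"  > readme.txt
echo "buggy" > app.txt
git add readme.txt app.txt
git commit -q -m "J"

echo "fixed" > app.txt
git add app.txt
git commit -q -m "K"

# The same comparison spelled two ways: K against its parent J.
git --no-pager show branch
git --no-pager diff branch~1 branch

# Only files that differ between the two snapshots are listed.
changed=$(git diff --name-only branch~1 branch)
echo "changed in K: $changed"
```

    readme.txt is in both snapshots but is identical in each, so it does not appear in the diff.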

    We can take that set of changes—the diff from J to K—and have Git apply that set of changes to any other commit. If we first use git checkout or git switch to attach HEAD to main, so that we have this:

    ...--G--H   <-- main (HEAD)
             \
              I--J--K   <-- branch
    

    then the current commit, whose snapshot we've had Git extract into our working tree, is that from commit H. We can now run:

    git cherry-pick branch
    

    to have Git compare J and K—Git finds the two hash IDs from the name branch and the metadata in commit K—and apply those same changes to our current commit H.

    Technically, this application actually uses Git's merge code. This merge can have merge conflicts, and if it does, Git will need to write to both Git's own staging area, and our working tree. In this case, Git will stop afterwards with a merge conflict error, and it will become our job to clean up the mess and finish the process. But if all goes well, Git will be able to do the merge on its own, and will then make an ordinary commit as usual. This ordinary commit will have H as its parent, and will get a new, unique hash ID. We could call it L, but since its effect is going to be the same as that for K, and its commit message will be taken from the commit message in K's metadata, I tend to call it K':

    ...--G--H--K'  <-- main (HEAD)
             \
              I--J--K   <-- branch
    

    Note that all of this requires that we have main checked out as the current branch, with commit H as the current commit when we start. When we are done, commit K' is the current commit and the name main selects commit K'.
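    The whole sequence can be replayed in a scratch repository. The sketch below assumes invented file names (app.txt, feature.txt) and a made-up commit message for K; it builds the graph drawn above, cherry-picks K onto main, and inspects the resulting K'.

```shell
set -e
# Throwaway repository reproducing ...--H (main) and I--J--K (branch).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "buggy" > app.txt
git add app.txt
git commit -q -m "H"
h=$(git rev-parse HEAD)

git checkout -q -b branch
echo "feature I" > feature.txt
git add feature.txt
git commit -q -m "I"
echo "feature J" >> feature.txt
git commit -q -am "J"
echo "fixed" > app.txt
git commit -q -am "K: fix terrible bug"
k=$(git rev-parse HEAD)

# Attach HEAD to main, then copy the J-to-K change onto H.
git checkout -q main
git cherry-pick branch

k_prime=$(git rev-parse HEAD)
parent_of_k_prime=$(git rev-parse "HEAD^")
subject=$(git log -1 --format=%s)
content=$(cat app.txt)
echo "K  = $k"
echo "K' = $k_prime (parent $parent_of_k_prime)"
```

    K' carries the fix and K's message, has H as its parent, and does not bring along feature.txt, since that file came from I and J rather than from the J-to-K change.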

    Using git worktree

    The question is whether it is possible to copy a commit when I am in the original branch.

    Let's say you have exactly the starting situation I drew a moment ago, but you're on branch and—having made K—you've already started work on what will be commit L soon:

    ...--G--H   <-- main
             \
              I--J--K   <-- branch (HEAD)
    

    To switch to main, you would need to commit your current work-so-far somewhere—perhaps using git stash (which makes commits) or perhaps just using git commit a little early (the method I'd use before git worktree existed)—and then you can git switch main to get onto main and begin the cherry-pick.
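    For comparison, here is a sketch of the stash route in a scratch repository. File names and messages are invented, and git checkout is used in place of git switch so the sketch also runs on older Git versions.

```shell
set -e
# Throwaway repository: H on main, I--J--K on branch (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "buggy" > app.txt
git add app.txt
git commit -q -m "H"
git checkout -q -b branch
echo "feature I" > feature.txt
git add feature.txt
git commit -q -m "I"
echo "feature J" >> feature.txt
git commit -q -am "J"
echo "fixed" > app.txt
git commit -q -am "K: fix terrible bug"

# Work toward commit L has started but is not committed yet.
echo "half-finished work for L" >> feature.txt

git stash                  # park the work-so-far (stash makes commits)
git checkout -q main
git cherry-pick branch     # copy K onto main as K'
git checkout -q branch
git stash pop              # bring the work-so-far back

wip=$(tail -n 1 feature.txt)
main_fix=$(git show main:app.txt)
echo "restored work: $wip"
echo "app.txt on main: $main_fix"
```

    This works, but it swaps the contents of the working tree twice, which is the disruption the next approach avoids.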

    To avoid derailing your work-so-far, though, you can avoid all of this using git worktree add to create a new working tree. This new working tree comes with its own new HEAD and index / staging-area. The new working tree is then populated, as if by git checkout or git switch, from some existing or new branch name (you choose whether to create a new branch name, or use some existing one, as part of the options to git worktree).

    So, assuming you are in the top level of your existing work-tree, you might run:

    git worktree add ../project.main main
    

    which would create ../project.main, enter that new empty directory, create a .git file that connects the new directory to your existing repository, and then populate that new working tree from the main branch. This new working tree is on the main branch. You can then create a new window, or just use the existing one:

    pushd ../project.main
    

    for instance (assuming bash or similar). A git status in this added work-tree will show that you are on branch main, while a git status in the default working tree will show that you are on branch branch.
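    The same check can be scripted without changing directories by using git -C. A minimal sketch, assuming a scratch layout in which the main working tree lives in a directory named project (that name, like the commits, is made up):

```shell
set -e
# Scratch layout: $base/project is the main working tree (hypothetical).
base=$(mktemp -d)
mkdir "$base/project"
cd "$base/project"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"
git checkout -q -b branch
git commit -q --allow-empty -m "K"

# Add a second working tree, populated from main.
git worktree add ../project.main main

here=$(git symbolic-ref --short HEAD)
there=$(git -C ../project.main symbolic-ref --short HEAD)
link=$(cat ../project.main/.git)    # a plain file, not a directory

echo "default working tree is on: $here"
echo "added working tree is on:   $there"
echo "link file says:             $link"
git worktree list
```

    Each working tree has its own HEAD, so the two can sit on different branches of the same repository at the same time.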

    Commits you make in the added work-tree go into the (shared) repository, and update the branch name that you're on in that added work-tree. When you are done with the added work-tree, you can simply remove it entirely:

    popd
    rm -rf ../project.main
    

    and then run git worktree prune to make Git aware that the added work-tree is gone:

    git worktree prune
    

    (Current Git versions have git worktree remove to remove-and-prune an added work-tree in one step, but earlier Git versions lack this extra feature, requiring the two-step remove-and-prune dance. Be aware that while git worktree was new in Git 2.5, it has a nasty bug that was not fixed until Git 2.15. If you use a version of Git >= 2.5 but < 2.15 and use git worktree add, get all your added-work-tree work done within two weeks, and you will avoid being bitten by this bug.)
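    Putting the pieces together, the sketch below copies K to main while the original working tree stays on branch with uncommitted work in it. It assumes a scratch repository with invented file names and messages, and uses the two-step rm plus git worktree prune cleanup described above.

```shell
set -e
# Scratch layout: $base/project is the main working tree (hypothetical).
base=$(mktemp -d)
mkdir "$base/project"
cd "$base/project"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo "buggy" > app.txt
git add app.txt
git commit -q -m "H"
git checkout -q -b branch
echo "feature I" > feature.txt
git add feature.txt
git commit -q -m "I"
echo "feature J" >> feature.txt
git commit -q -am "J"
echo "fixed" > app.txt
git commit -q -am "K: fix terrible bug"

# Uncommitted work toward L, which we do not want to disturb.
echo "half-finished work for L" >> feature.txt

# Cherry-pick in a separate working tree that is on main.
git worktree add ../project.main main
git -C ../project.main cherry-pick branch

# Two-step cleanup, as described above.
rm -rf ../project.main
git worktree prune

still_on=$(git symbolic-ref --short HEAD)
dirty=$(git diff --name-only)
main_fix=$(git show main:app.txt)
trees=$(git worktree list | grep -c .)
echo "still on: $still_on, uncommitted: $dirty, main has: $main_fix"
```

    The original working tree was never switched, so the in-progress edit to feature.txt is untouched, yet main now ends in K'.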