
Find commits on a git branch that were not cherry-picked into another branch


I have two branches in git: master, which contains all commits, and another branch, e.g. release, which contains some commits cherry-picked from master. Because the commits on release are cherry-picked, they have different hashes than the corresponding commits on master, but the commit messages are the same.

Now I want to find the commits on master that were not cherry-picked into release. Note that the cherry-picked commits might differ in code from the original commits because of conflict resolutions. How can I do it? Is there native support in git for this?

Example:

master branch:

git checkout master
git log --oneline -7

gives

2cba4b1d (HEAD -> master) Message subject for commit 7
f54fc16f Message subject for commit 6
4d871cbd Message subject for commit 5
a83ed44c Message subject for commit 4
48d0fb73 Message subject for commit 3
931da9a6 Message subject for commit 2
8553323b Message subject for commit 1

release branch:

git checkout release
git log --oneline -5

gives

d65a04c6 (HEAD -> release) Message subject for commit 7
8aeecd92 Message subject for commit 6
2a54e335 Message subject for commit 4
99985f38 Message subject for commit 3
e76a9bb4 Message subject for commit 1

So the difference between the two branches will be two commits with message subjects:

Message subject for commit 5
Message subject for commit 2

It is also OK if it shows commit hashes:

4d871cbd Message subject for commit 5
931da9a6 Message subject for commit 2

Additional clarifications and requirements:

The example above lists the missing commits in the same order as they appear in the commit log of master. Keeping that order helps to identify the commits in the original log of master, so it would be nice if the solution preserved it too.

In my case both branches have linear history and there are no merge commits.


Solution

  • Your question is very similar to another one I read months ago about a way to identify rebased commits. Like rebasing, cherry-picking is about extracting the changes made in a commit and applying them on top of another commit. Neither command keeps track of the original commit; git has no need to tell the "copies" apart, mainly because applying them could produce conflicts and the resulting commit would then be different, as you know.

    Fortunately, git gives us great help with cherry-picked commits: the --cherry-pick option of git log. I invite you to read the whole description (and the one for --left-only/--right-only too), but this is the interesting part:

    Omit any commit that introduces the same change as another commit on the “other side” when the set of commits are limited with symmetric difference.

    Seems promising, right? Not quite, and here is the problem: "the same change as another commit". What if the cherry-picked commit is different after a conflict resolution? Git cannot recognize it as cherry-picked, because the two commits are no longer patch-equivalent, so this option alone is not enough. Starting from the easiest situation (which is not your case), where all the cherry-picked commits were applied cleanly to the other branch, you could solve it with this:

    git log --format="%h %s" --cherry-pick --left-only --no-merges master...release
    

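    The options are well explained in the documentation, except perhaps for the concept of symmetric difference: master...release selects the commits reachable from either branch but not from both. In summary, the command lists all the commits on master that were not cleanly cherry-picked into release.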
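
    To make the behaviour of --cherry-pick on a symmetric difference concrete, here is a minimal sketch that builds a throwaway repository and runs the command on it. The file names, the file contents and the "Demo" identity are made up for the illustration; only the branch names and the commit subjects come from the question.

```shell
# Build a scratch repository (assumption: git is installed and mktemp is available).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # force the branch name used in the question
git config user.name "Demo"                  # hypothetical identity, local to this repo
git config user.email "demo@example.com"

# One shared base commit, then branch off release.
echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch release

# Three commits on master, each adding its own file.
for n in 1 2 3; do
    echo "$n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "Message subject for commit $n"
done

# Cherry-pick commits 1 and 3 into release; both apply cleanly.
git checkout -q release
git cherry-pick master~2 > /dev/null
git cherry-pick master > /dev/null

# Commits on master with no patch-equivalent commit on release.
missing=$(git log --format="%s" --cherry-pick --left-only --no-merges master...release)
printf '%s\n' "$missing"                     # prints: Message subject for commit 2
```

    For the same clean-pick situation, git cherry -v release master is a related built-in: it lists the commits on master and marks those without an equivalent on release with "+" and those with one with "-".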

    It is not perfect as I said, but at least we have a good starting point: now we just need to remove from this list all the commits whose commit message corresponds to the commit message of another commit in the release branch, finding the cherry-picked commits that produced a conflict. This is the only possible check you are left to do, excluding the reflog.

    Here is the script (not fully tested):

    git log --format="%h %s" --cherry-pick --left-only --no-merges master...release |
    while read -r cmt_hash cmt_msg
    do
        # Print the commit only if no commit on release has exactly the same subject
        git log --format="%s" master..release | grep -Fxq -e "${cmt_msg}" || echo "${cmt_hash} ${cmt_msg}"
    done
    

    Basically, read splits each %h %s line into the abbreviated hash and the subject (%s). grep then looks for that subject among the subjects of the commits that exist only on release; if there is no match, the commit is printed on stdout. The -F option makes grep treat the subject as a fixed string rather than a regular expression, so it cannot match something it should not; -x requires the whole line to match, and -q suppresses grep's own output.
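
    The loop above runs git log once per candidate commit. As a variation on the same idea, the sketch below does the filtering in a single pass with awk. The function name missing_picks and the "D "/"S " line prefixes are hypothetical, chosen only for this example; like the script above, it relies on cherry-picked commits keeping their original subject.

```shell
# missing_picks SRC DST (hypothetical helper)
# Lists commits on SRC that have neither a patch-equivalent commit
# nor a commit with the same subject on DST, newest first.
missing_picks() {
    src=$1
    dst=$2
    {
        # Subjects of the commits that exist only on DST, tagged with "D ".
        git log --format="D %s" "$src..$dst"
        # Candidates from SRC that are not patch-equivalent to a DST commit, tagged with "S ".
        git log --format="S %h %s" --cherry-pick --left-only --no-merges "$src...$dst"
    } | awk '
        $1 == "D" {                      # remember every subject seen on DST
            sub(/^D /, "")
            seen[$0] = 1
            next
        }
        {                                # candidate line: "S <hash> <subject>"
            sub(/^S /, "")
            subject = $0
            sub(/^[^ ]+ /, "", subject)  # drop the hash, keep the subject
            if (!(subject in seen)) print
        }
    '
}
```

    Calling missing_picks master release inside the repository prints lines in the same "%h %s" form as the script above, in the order git log reports them.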