libgit2 - cherry pick multiple commits

I am looking for a method to cherry pick two or more commits.

My goal is to be able to cherry pick multiple commits to allow a user to review those changes before committing them, and not requiring users to commit after each cherry pick.

I've added below a code snippet that will accept a repository path, followed by two commits and try to cherry pick them consecutively. However I'm not certain what options I need to set to allow two commits to be cherry picked.

As is, the first cherry pick works, but the 2nd fails with "1 uncommitted change would be overwritten by merge".

I had tried using the option GIT_CHECKOUT_ALLOW_CONFLICTS but was not successful. What options are needed to allow for cherry picking multiple commits?

#include <stdio.h>
#include "git2.h"

#define onError(error, errorMsg)\
if (error){\
    const git_error* lg2err = giterr_last();\
    if (lg2err){\
        printf("%s %s\n", errorMsg, lg2err->message);\
        return 1;\
    }\
}

int main(int argc, char* argv[])
{

    if(argc != 4) { printf("Provide repo commit1 commit2\n"); return 1;}

    printf("Repository: %s\n Commit1: %s\n Commit2: %s\n", argv[1], argv[2], argv[3]);

    int error;
    git_libgit2_init();
    git_repository * repo;
    git_oid cid1, cid2;
    git_commit *c1 =NULL;
    git_commit *c2 =NULL;

    error = git_repository_open(&repo, argv[1]);
    onError(error,"Repo open failed: ");

    git_cherrypick_options cherry_opts = GIT_CHERRYPICK_OPTIONS_INIT;

    git_oid_fromstr(&cid1, argv[2]);
    git_oid_fromstr(&cid2, argv[3]);
    error = git_commit_lookup(&c1, repo, &cid1);
    onError(error,"commit lookup failed: ");
    error = git_commit_lookup(&c2, repo, &cid2);
    onError(error,"commit2 lookup failed: ");

    error = git_cherrypick(repo, c1, &cherry_opts);
    onError(error,"cherry1 failed: ");
    error = git_cherrypick(repo, c2, &cherry_opts);
    onError(error,"cherry2 failed: ");

    return 0;
}

Solution

What's happening is that libgit2 is refusing to overwrite a file on disk that has been modified but whose contents have not actually been stored anywhere by git. This file is "precious", and git and libgit2 will take great pains to avoid overwriting it.

There's no way to overcome this, because cherry-picking does not apply the differences in the commit on top of your working directory contents. It applies the differences in the commit to HEAD. That is to say that your only options would be to ignore the changes in this cherry-pick or to overwrite the changes that the previous cherry-pick introduced.

Let me give you a concrete example:

Suppose that you have some file at commit 1:

one
two
three
four
five

And you have some commit based on 1 (let's call it 2), that changes the file to be:

one
2
three
four
five

And you have still another commit in a different branch. It's also based on 1 (let's call it 2'). It changes the file to be:

one
two
three
4
five

What happens if you are on commit 1 and cherry-pick both 2 and 2' without committing? Logically, you might expect it to do a merge! But it will not.

If you're on commit 1 and you call git_cherrypick for commit 2 in libgit2 (or run git cherry-pick --no-commit on the command line), it will read the file out of HEAD and apply the changes from commit 2. This is a trivial example, so the result literally matches the contents of commit 2. That file will be placed on disk.

Now, if you do nothing else - you don't commit this - then you're still on commit 1. And if you again do a git_cherrypick (this time for commit 2') then libgit2 will read the file out of HEAD and apply the changes for commit 2'. And again, in this trivial example, applying the changes in 2' to the file in 1 gives you the contents of the file in commit 2'.

Because what it won't do is read the file out of the working directory.

So now when it goes to try to write those results to the working directory, there's a checkout conflict, because the contents of the file on disk match neither the file in HEAD nor the file we're trying to check out. So you're blocked.
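To make that concrete, here is a toy model of the example in plain C. It is not how libgit2 is implemented: the five-line "files", the line-by-line merge and every name in it (file_t, apply_pick, cherry_pick_no_commit, replay_example) are made up for illustration. It only captures the two rules that matter here, namely that the pick is merged against HEAD and that the write is refused when the file on disk matches neither HEAD nor the result.

```c
#include <stdio.h>
#include <string.h>

#define NLINES 5

/* Toy model: a "file" is five lines, and a "commit" is one such file. */
typedef struct { const char *line[NLINES]; } file_t;

static const file_t COMMIT_1  = {{ "one", "two", "three", "four", "five" }};
static const file_t COMMIT_2  = {{ "one", "2",   "three", "four", "five" }};
static const file_t COMMIT_2P = {{ "one", "two", "three", "4",    "five" }}; /* 2' */

static int same_file(const file_t *a, const file_t *b)
{
    int i;
    for (i = 0; i < NLINES; i++)
        if (strcmp(a->line[i], b->line[i]) != 0)
            return 0;
    return 1;
}

/* Apply the change parent -> picked on top of `ours`, line by line.
 * Returns 0 on success, -1 if both sides changed the same line differently. */
static int apply_pick(file_t *out, const file_t *parent,
                      const file_t *picked, const file_t *ours)
{
    int i;
    for (i = 0; i < NLINES; i++) {
        int theirs_changed = strcmp(parent->line[i], picked->line[i]) != 0;
        int ours_changed = strcmp(parent->line[i], ours->line[i]) != 0;
        if (theirs_changed && ours_changed &&
            strcmp(picked->line[i], ours->line[i]) != 0)
            return -1;
        out->line[i] = theirs_changed ? picked->line[i] : ours->line[i];
    }
    return 0;
}

/* Model of a cherry-pick without a commit: merge against HEAD (never against
 * the working directory), then refuse to write if the file on disk matches
 * neither HEAD nor the result. Returns 0 if written, -1 if blocked. */
static int cherry_pick_no_commit(file_t *workdir, const file_t *head,
                                 const file_t *parent, const file_t *picked)
{
    file_t result;

    if (apply_pick(&result, parent, picked, head) < 0)
        return -1; /* merge conflict */
    if (!same_file(workdir, head) && !same_file(workdir, &result))
        return -1; /* checkout conflict */
    *workdir = result;
    return 0;
}

/* Replays the example: HEAD stays at commit 1 the whole time. */
int replay_example(void)
{
    file_t workdir = COMMIT_1;
    int first = cherry_pick_no_commit(&workdir, &COMMIT_1, &COMMIT_1, &COMMIT_2);
    int second = cherry_pick_no_commit(&workdir, &COMMIT_1, &COMMIT_1, &COMMIT_2P);

    printf("first pick:  %s\n", first == 0 ? "ok" : "blocked");
    printf("second pick: %s\n", second == 0 ? "ok" : "blocked");
    return second;
}
```

Calling replay_example() reports the first pick as ok and the second as blocked: after the first pick the working directory holds commit 2's contents, HEAD still holds commit 1's, and the second result is commit 2', so the file on disk matches neither.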

What you probably want to do is create a commit at this stage. I know you said that you wanted to avoid "requiring users to commit after each cherry pick". But there's a difference between creating a commit object in libgit2, which is lightweight and can be discarded easily (it will be garbage collected eventually), and doing the moral equivalent of running git commit, which updates a branch pointer.

If you merely create a commit and write it into the object database - without switching to it or checking it out - then you can reuse that data for other steps in your work without ever giving the user the appearance of having done a commit. It's entirely in memory (and a little bit in the object database) without ever hitting the working directory.

What I'd encourage you to do is to cherry-pick each commit that you want into an index, which does its work in-memory and doesn't touch the disk. When you're happy with the results, you can create a commit object. You'll need to use the git_cherrypick_commit API instead of git_cherrypick to produce an index, then turn that index into a tree for the commit. For example:

git_reference *head;
git_signature *signature;
git_commit *base1, *result1;
git_index *idx1, *idx2;
git_oid tree_id1, result_id1;
const git_oid *parents1[1];

/* Look up the HEAD reference */
git_repository_head(&head, repo);
git_reference_peel((git_object **)&base1, head, GIT_OBJ_COMMIT);

/* Pick the first cherry, getting back an index (this call takes merge options) */
git_cherrypick_commit(&idx1, repo, c1, base1, 0, &cherry_opts.merge_opts);

/* Write that index into a tree; the index is in-memory, so name the repository */
git_index_write_tree_to(&tree_id1, idx1, repo);

/* And create a commit object for that tree, with the HEAD commit as its parent.
 * git_commit_create_from_ids is declared in <git2/sys/commit.h>. */
git_signature_now(&signature, "My Cherry-Picking System", "[email protected]");
parents1[0] = git_commit_id(base1);
git_commit_create_from_ids(&result_id1,
    repo,
    NULL, /* don't update a reference */
    signature,
    signature,
    NULL,
    "Transient commit that will be GC'd eventually.",
    &tree_id1,
    1,
    parents1);
git_commit_lookup(&result1, repo, &result_id1);

/* Now, you can pick the _second_ cherry with the commit you just created as a base... */
git_cherrypick_commit(&idx2, repo, c2, result1, 0, &cherry_opts.merge_opts);

Eventually you'll get your terminal commit and you can just check it out - and I mean that in the libgit2 git_checkout notion of checking out, which just puts those contents in your working directory. Still, don't update any branch pointers. This will give the result where files are only modified in the working directory (and index) but the user has not committed anything, their HEAD has not moved.

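git_checkout_options checkout_opts = GIT_CHECKOUT_OPTIONS_INIT;
git_checkout_tree(repo, (git_object *)final_result_commit, &checkout_opts);

(You can pass a git_commit * to git_checkout_tree, cast to git_object *. It knows what to do.)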

I could have made this a lot easier for you by giving you a git_cherrypick_tree API. This would let you cut out the middleman of creating a commit that you don't need. But I didn't think that anybody would want to do this. (Sorry!)

The reason that I didn't think that anybody would want to do this is because what you're describing is more accurately called rebase. Rebase is a sequenced set of patch application or cherry-pick steps. (Interactive rebase is a bit more involved, so let's ignore that for now.)

libgit2 has git_rebase machinery that can work entirely in-memory, saving you some of the bookkeeping involved in converting indexes to trees and writing commits to disk. You ask for this by setting the inmemory field of git_rebase_options before calling git_rebase_init (internally, see rebase_commit_inmemory), which may help you here.

In either case, the end result is largely the same, a series of commits that were written into the object database without the user ever knowing about it, and updating their working directory to match at the end.