JGit: RevWalk order overriding starting point


I'm using JGit for one of my projects that involves intensive use of git.

My aim is to use a RevWalk to iterate over the commits in a repository in chronological order, starting at a specific commit. I've managed to achieve both of these separately:

  • Chronological order by applying a RevSort.REVERSE
  • Starting point by calling RevWalk.markStart(RevCommit c)

My problem is that when I try to combine the two, the RevSort seems to override the markStart, and the RevWalk always ends up starting at the first commit in the history instead of the commit that I've specified.

This snippet shows what I've got:

import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.internal.storage.file.FileRepository;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevSort;

import java.io.IOException;
import org.eclipse.jgit.errors.AmbiguousObjectException;
import org.eclipse.jgit.errors.MissingObjectException;

public class Main {

    public static void main(String[] args) throws IOException, AmbiguousObjectException, MissingObjectException {
        final String repositoryPath = args[0];
        final String commitID = args[1];
        final Repository repository = new FileRepository(repositoryPath + "/.git");
        final RevWalk walk = new RevWalk(repository);
        walk.sort(RevSort.REVERSE);
        walk.markStart(walk.parseCommit(repository.resolve(commitID)));
        for (final RevCommit revCommit : walk) {
            System.err.println(revCommit.getId());
        }
    }

}

This should print the IDs of the commits in chronological order, starting at the specified commit, but it ignores the second setting and starts from the initial commit.

UPDATE:

I've investigated the problem further, and it turns out that when the two options are applied together (in any order), markStart effectively becomes a markStop. I think this happens because markStart is always applied first and limits the range of commits (with a filter), and that range is then reversed by the RevSort. Basically, the RevWalk iterates over the complement of the set of commits that I'm interested in.

Should I assume that what I'm trying to do is not obtainable in this way? I couldn't think of another way to get it without traversing the whole repository up to my starting point, but that sounds highly inefficient.

UPDATE 2: To give a proper example, here is what I was expecting to achieve. Assume that we have a repository containing 4 commits: A, B, C and D. I'm interested only in the commits from B to the current revision, excluding A, in chronological order. I was hoping to be able to use markStart and sort to achieve that in the following way:

@Test
public void testReverse2() throws Exception {
    final RevCommit commitA = this.git.commit().setMessage("Commit A").call();
    final RevCommit commitB = this.git.commit().setMessage("Commit B").call();
    final RevCommit commitC = this.git.commit().setMessage("Commit C").call();
    final RevCommit commitD = this.git.commit().setMessage("Commit D").call();

    final RevWalk revWalk = new RevWalk(this.git.getRepository());
    revWalk.markStart(revWalk.parseCommit(commitB));
    revWalk.sort(RevSort.REVERSE);

    assertEquals(commitB, revWalk.next());
    assertEquals(commitC, revWalk.next());
    assertEquals(commitD, revWalk.next());
    assertNull(revWalk.next());
    revWalk.close();
}

Now, from what I've seen, this doesn't work because markStart is always executed before the sort, so the actual behaviour satisfies the following test:

assertEquals(commitA, revWalk.next());
assertEquals(commitB, revWalk.next());
assertNull(revWalk.next());

That is the opposite of what I'm trying to obtain. Is this intended behaviour and, if so, how else could I approach the problem?


Solution

  • In Git, commits have only links to their parent(s). commitB does not know about its successors commitC and commitD.

    Hence a history can only be traversed backwards, from a given commit to its parents, grandparents, etc. There is no information that would allow traversal in the opposite direction.

    In your example, the RevWalk walks from commitB back to commitA. The REVERSE sort only changes the order in which the iterator returns those commits; it cannot make the walk go forward.

    If you actually want to find the commits between commitB and HEAD, you will need to start at HEAD. Or, more generally, you would need to start from all known branch tips to find the possibly multiple paths that lead to commitB.
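
To illustrate the reasoning above, here is a minimal plain-Java sketch that does not use JGit at all. The Commit class and the range method are hypothetical names invented for this example: each commit only knows its parents, so the only way to list "B up to the tip" is to walk backwards from the tip, stop at everything reachable from B's parents, and reverse the result afterwards. The reversal gives a sensible chronological order only for a linear history such as A, B, C, D; with merges you would need a proper topological sort.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ForwardRange {

    /** Hypothetical stand-in for a commit: it has links to its parents only. */
    static final class Commit {
        final String name;
        final List<Commit> parents;

        Commit(String name, Commit... parents) {
            this.name = name;
            this.parents = List.of(parents);
        }
    }

    /**
     * Returns the names of all commits reachable from tip but not reachable
     * from any parent of start, oldest first (for a linear history).
     */
    static List<String> range(Commit start, Commit tip) {
        // Everything reachable from the parents of start is excluded.
        Set<Commit> excluded = new HashSet<>();
        Deque<Commit> todo = new ArrayDeque<>(start.parents);
        while (!todo.isEmpty()) {
            Commit c = todo.pop();
            if (excluded.add(c)) {
                todo.addAll(c.parents);
            }
        }

        // Walk backwards from the tip; this is the only direction available.
        List<String> result = new ArrayList<>();
        Set<Commit> seen = new HashSet<>();
        todo.push(tip);
        while (!todo.isEmpty()) {
            Commit c = todo.pop();
            if (excluded.contains(c) || !seen.add(c)) {
                continue;
            }
            result.add(c.name);
            for (Commit parent : c.parents) {
                todo.push(parent);
            }
        }

        // The walk produced newest-first order, so reverse it at the end.
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        Commit a = new Commit("A");
        Commit b = new Commit("B", a);
        Commit c = new Commit("C", b);
        Commit d = new Commit("D", c);
        System.out.println(range(b, d)); // prints [B, C, D]
    }
}
```

With JGit itself, the same idea would presumably be expressed by calling markStart on the HEAD commit and markUninteresting on each parent of commitB before applying RevSort.REVERSE; treat that as a sketch to verify against your JGit version rather than a tested recipe.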