
GitPython: get last commit with detached HEAD


In an up-to-date repository we do the following:

git checkout HEAD~5

Then using GitPython, we can obtain the head commit, which is detached:

import git

repo = git.Repo('.')
head = repo.head
head_commit = head.commit
print(head.is_detached)  # prints True

Is there a way in GitPython to obtain the last commit in the branch?

I'm thinking of something along the lines of:

last = repo.active_branch.last_commit  # active_branch will throw an error when head is detached.

or

last = head_commit
# I don't even know if talking about child commits makes sense in Git.
while last.child is not None:
    last = last.child

Solution

  • The last commit in any branch is given by the commit hash ID stored in the branch name.

    That's all—it really is that simple. Git commits link only backwards, from child to parent. So each branch name must store the hash ID of its last commit.
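
In GitPython terms, a minimal sketch of that lookup might look like the following. The helper name `branch_tip` is made up for illustration, and `repo` is assumed to be a `git.Repo` object (or anything exposing the same `heads` collection, whose entries have a `commit` attribute).

```python
def branch_tip(repo, branch_name):
    """Return the commit that the named local branch points to.

    `repo` is expected to behave like a GitPython `git.Repo`:
    `repo.heads[name]` yields a Head whose `.commit` is the branch tip.
    """
    return repo.heads[branch_name].commit
```

For instance, after `import git`, calling `branch_tip(git.Repo('.'), 'master')` should return the last commit of `master` whether or not HEAD is detached, assuming a branch with that name exists.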

    In an up-to-date repository we do the following:

    git checkout HEAD~5
    

    After doing this, you have, as you said in the title, a "detached HEAD". This means you are no longer on any branch. The question of "last commit" becomes kind of meaningless: HEAD points directly to a commit, and that commit is by definition the current commit. You may check out any commit you like, e.g., by giving git checkout a raw commit hash ID, and you will continue to be on a detached HEAD, but on a different commit.
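
Since `repo.active_branch` fails on a detached HEAD, one way to guard against that is to test `repo.head.is_detached` first. This is only a sketch: `current_branch_name` is a hypothetical helper, and `repo` is assumed to behave like a GitPython `git.Repo`.

```python
def current_branch_name(repo):
    """Return the checked-out branch name, or None when HEAD is detached."""
    if repo.head.is_detached:
        # HEAD points straight at a commit, so there is no branch to report.
        return None
    return repo.active_branch.name
```

After `git checkout HEAD~5` this would return `None`, which reflects the situation described above: there is no current branch to ask for a last commit.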

    The effect is that you are now on an anonymous branch (one with no name) whose last commit is the current commit. Creating a new branch at this point, pointing to this commit, brings that branch into existence with the current commit as its last commit.

    Given a chain of commits like:

    ... <-F <-G <-H <-I   <-- br1
                       \
                        J <-K   <-- br2
    

    all the commits up through I are on br2, even though I is also the last commit of br1 so that all the commits up through I are on br1. They are on both branches. Commits J and K are only on br2.

    If we delete the name br1, all the commits continue to be findable via the name br2. Git finds (all) commits by reading each hash ID from (all) branch names and other such names, then working backwards from those commits to their parents, and to the parents' parents, and so on.
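
The backwards walk can be illustrated with a toy model in plain Python. This is not how Git is implemented, just a sketch of the idea: the dictionaries below mirror the diagram with br1 at I and br2 at K, and commit F is treated as the root only to keep the example small.

```python
# Toy model of the graph above: each commit maps to its parent(s).
# F is treated as the root here only to keep the example small.
PARENTS = {
    "F": [], "G": ["F"], "H": ["G"], "I": ["H"],
    "J": ["I"], "K": ["J"],
}
BRANCHES = {"br1": "I", "br2": "K"}  # name -> hash ID of last commit


def reachable(tip, parents=PARENTS):
    """Collect every commit found by walking backwards from `tip`."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


on_br1 = reachable(BRANCHES["br1"])
on_br2 = reachable(BRANCHES["br2"])
print(sorted(on_br1 & on_br2))  # ['F', 'G', 'H', 'I'] are on both branches
print(sorted(on_br2 - on_br1))  # ['J', 'K'] are only on br2
```

Every commit reachable from br1 is also reachable from br2, which is why deleting the name br1 loses nothing in this model.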

    If at this point you git checkout br1 and create one new commit, you get:

                        L   <-- br1 (HEAD)
                       /
    ... <-F <-G <-H <-I
                       \
                        J <-K   <-- br2
    

    Note that HEAD is now attached to branch br1. Commit L is the last commit in this branch; commits through I are in both branches.

    If you now detach HEAD and move it to commit I you get:

                        L   <-- br1
                       /
    ... <-F <-G <-H <-I   <-- HEAD
                       \
                        J <-K   <-- br2
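
From a detached HEAD, GitPython can at least report which branch names still reach the current commit. The sketch below assumes `repo` behaves like a `git.Repo` and relies on its `heads` collection and `is_ancestor` method; the function name `branches_containing_head` is hypothetical.

```python
def branches_containing_head(repo):
    """Return names of local branches whose history includes HEAD's commit."""
    here = repo.head.commit
    return [
        branch.name
        for branch in repo.heads
        # True when `here` can be reached by walking back from the branch tip.
        if repo.is_ancestor(here, branch.commit)
    ]
```

With HEAD detached at commit I as drawn above, both br1 and br2 would be listed, so the caller still has to decide which branch's last commit is wanted.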
    

    Exercises (do these in order)

    1. I will give you two raw commit hash IDs, with the proviso that the first commit's hash ID will be reachable by starting at the second commit and walking backwards, one step at a time along these backward-pointing arrows. For instance, I might give you the hash ID of commit F and the hash ID of commit K. How can you use this information to find commit G? (Think about starting at commit K and following each arrow while keeping some sort of log of each commit visited.)

    2. What's the next commit after commit I? What other information do you need in order to find this next commit? Remember that commit I is on both br1 and br2.
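
One way to check your answers to both exercises is a toy model of the last diagram in plain Python. It is a sketch under simplifying assumptions: every commit has a single parent, F is treated as the root, and the names `PARENT`, `TIPS` and `next_commit` are invented for the example.

```python
# Toy model of the last diagram; single-parent links only, F treated as the root.
PARENT = {"G": "F", "H": "G", "I": "H", "J": "I", "K": "J", "L": "I"}
TIPS = {"br1": "L", "br2": "K"}


def next_commit(start, tip, parent=PARENT):
    """Walk back from `tip`, logging commits, and return the one just after `start`.

    Returns None if `start` is the tip itself or is not reachable from `tip`.
    """
    log = []
    commit = tip
    while commit is not None and commit != start:
        log.append(commit)
        commit = parent.get(commit)
    if commit is None or not log:
        return None
    return log[-1]  # the last commit logged is the child of `start`


print(next_commit("F", "K"))          # G
print(next_commit("I", TIPS["br1"]))  # L
print(next_commit("I", TIPS["br2"]))  # J
```

The "next" commit after I depends on which tip you walk back from, so a branch name (or some other later commit) is the extra information needed. On a real repository, `git rev-list --ancestry-path --reverse I..br2` should list the same forward path, here J and then K.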