Tags: git, gitlab, branch, commit, git-merge

What are merge request versions in GitLab and is that a specific GitLab feature?


In GitLab I have an open merge request with some commits on it. In the Changes panel I can see a dropdown with a version history.

[Screenshot: GitLab MR history]

[Screenshot: GitLab commit hashes]

These versions seem to be tied to the branch or to the merge request itself: a new version appears whenever I add commits or force-push an amended commit (which seems odd, since a force push is supposed to rewrite the branch history).

git commit --amend --no-edit
git push origin branch --force

But I cannot find any of the version refs in my local repository, besides the latest one:

git log 1d2ee59b

gives me the log, but

git log 1c7b76e4
git log 09dc0bb8

fail with "unknown revision or path not in the working tree".

So I'm wondering: is this a GitLab feature (some kind of custom reflog tied to merge requests or branches), or is it a Git feature I don't know about or don't understand?


Solution

  • This is not a Git feature, so it must be a GitLab feature.

    Yes, force pushes do rewrite branches. But consider git-rebase(1), a purely local operation: a rebase can make the old commits unreachable from the branch. After that they are only reachable through the reflog, and once the reflog entries expire the commits are garbage-collected. Nothing stops you, however, from hooking into the rewriting machinery and storing a ref to the old tip of the branch on every rebase. Maybe like this:

    refs/branch-history/<branch-name>/1
    refs/branch-history/<branch-name>/2
    refs/branch-history/<branch-name>/3
    refs/branch-history/<branch-name>/4
    

    Here “1” is the branch tip as it was before your first rebase, and so on. The commits from the old branch are now reachable again (via those refs) and thus will not be garbage-collected for as long as the refs exist.
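
A minimal sketch of that idea, assuming you call a helper by hand before each rewrite. The function name `save_branch_tip` and the `refs/branch-history/` namespace are made up for illustration; neither is a Git or GitLab convention.

```shell
#!/bin/sh
# Record the current tip of a branch under the next free numbered ref.
# Hypothetical helper: the name and the ref namespace are illustrative only.
save_branch_tip() {
    branch=$1
    # Resolve the branch to a commit ID; give up if the branch does not exist.
    tip=$(git rev-parse --verify --quiet "refs/heads/$branch^{commit}") || return 1
    # Count the history refs already stored for this branch.
    count=$(git for-each-ref --format='%(refname)' "refs/branch-history/$branch" | wc -l)
    next=$((count + 1))
    # Point a new ref at the old tip so the commit stays reachable.
    git update-ref "refs/branch-history/$branch/$next" "$tip"
}

# Typical use (not executed here):
#   save_branch_tip my-branch
#   git rebase main
```

Because the old tips are held by ordinary refs, `git log refs/branch-history/my-branch/1` keeps working after the rewrite. Such refs are local unless you push them explicitly.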

    GitHub in fact records every ref write in an audit log:

    Yeah, we face this problem at GitHub. We actually write every single ref write to $GIT_DIR/audit_log, which is essentially a reflog with the refname prepended. The key, though, is that it isn't ever read by git for reachability. So it becomes an immutable log of what happened, and we can happily prune the reflog to drop objects.

    And now (since Git 2.42) you can use gc.recentObjectsHook together with such a file to keep the commits from all previous versions alive, effectively implementing what GitLab already does. You do that by supplying a hook program that prints the IDs of the objects that should be kept, one hex object ID per line. (Listing the commit that was at the tip of the branch keeps every commit reachable from it alive as well.) If you have an “audit log” in that format, your program only needs to print the commit IDs recorded in the file.
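
A sketch of such a hook program, under stated assumptions: the log is a plain-text file where each line starts with the ref name followed by the old and the new object ID (a reflog line with the ref name prepended, as in the quote). Git does not write such a file itself, and the function name `print_audit_log_oids` is hypothetical.

```shell
#!/bin/sh
# Print every object ID found in the old/new columns of an audit log.
# Assumed line format: <refname> <old-id> <new-id> <rest of reflog entry>
print_audit_log_oids() {
    awk '{
        for (i = 2; i <= 3; i++)
            # Accept SHA-1 (40) or SHA-256 (64) hex IDs; skip the all-zero ID.
            if ($i ~ /^[0-9a-f]+$/ && (length($i) == 40 || length($i) == 64) && $i !~ /^0+$/)
                print $i
    }' "$1" | sort -u
}

# A hook script would end with a call such as (not executed here):
#   print_audit_log_oids "$(git rev-parse --git-dir)/audit_log"
# and be registered with:
#   git config gc.recentObjectsHook /path/to/that/script
```

Git treats the listed objects as recent when it prunes, so they and everything reachable from them survive only for as long as the hook keeps printing them.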