Tags: git, automation, githooks, git-fetch, reproducible-research

Fetch refs not associated with a branch


I have a program that runs some scientific tests. I have written a git post-receive hook for the server that runs these tests:

  1. When it receives a commit, e.g. deadbeef..., run the program
  2. Commit the output of the program without adding it to any branch
  3. Create a reference to the new commit, refs/results/deadbeef...

Now I want to fetch this reference from the client. How can I do that?

Alternatively, how should I change my approach?

Git log

  * bot: add results from `make run`
 /
* Change experimental setup
|
| * bot: add results from `make run`
|/
* Add feature
|
| * bot: add results from `make run`
|/
* Initial commit

Motivation

We keep a separate leaf commit for each result because the output is often large and we do not want to keep a complete record of every output the program has ever generated. Eventually most results will be discarded, but for those that are retained, it should be easy to reproduce them. It should also be easy to find the results for a given source code commit, if they are stored in Git.
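A minimal sketch of that lookup, assuming the refs/results/<full commit hash> naming scheme used by the hook below. The helper name results_for is hypothetical (it is not a Git command), and the throwaway repository only exists to exercise it:

```shell
#!/bin/sh -e
# results_for is a hypothetical helper: given any revision, print the hash of
# the results commit stored under refs/results/<full hash>, or fail if none.
results_for() {
    src=$(git rev-parse --verify --quiet "$1^{commit}") || return 1
    git rev-parse --verify --quiet "refs/results/$src^{commit}"
}

# Throwaway repository that mimics what the hook produces.
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"
git config user.name 'demo'
git config user.email 'demo@example.invalid'
git commit --quiet --allow-empty --message 'Add feature'
code=$(git rev-parse HEAD)

# A results commit on no branch, reachable only through refs/results/<hash>.
results=$(git commit-tree -p "$code" -m 'bot: add results' "$code^{tree}")
git update-ref "refs/results/$code" "$results"

found=$(results_for HEAD)
echo "results for HEAD: $found"
```

Because the ref name is derived from the full hash, the helper accepts any revision syntax (a branch name, an abbreviated hash, HEAD~2) and resolves it before looking up the ref.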


For completeness, below is the post-receive hook:

post-receive

#!/bin/sh -e

# post-receive hook receives its arguments on stdin
# in case multiple branches are pushed at once, we use a while loop
while read oldrev newrev refname
do
    branch=$(git rev-parse --symbolic --abbrev-ref "${refname}")
    if [ "${branch}" = "run" ]; then
        cd ..
        unset GIT_DIR

        git checkout run

        # Enter detached-HEAD mode (`git checkout HEAD` would stay on the branch).
        # Commits made are not added to any particular branch.
        git checkout --detach

        # In the event of error, we must checkout dummy so users can push to the run
        # branch (pushing is not allowed if the branch is checked out).
        make run || git checkout -B dummy

        git add --all
        git commit --message 'results from `make run`' \
           --author 'results bot <@>'

        # Add a reference so we can find the results commit using the hash of
        # the commit containing the code used to generate the results.
        git update-ref "refs/results/$(git rev-parse HEAD~)" HEAD

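        # Also tag the latest results "results". --force moves the tag if it
        # exists and, unlike `git tag --delete`, does not fail under `sh -e`
        # on the first run, when the tag does not exist yet.
        git tag --force results HEAD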

        # Now, if we have some code in commit deadbeef..., its results can be
        # found in refs/results/deadbeef...  The intent is for this ref to reach
        # the user the next time they run `git fetch` or `git pull`.

        # Finally, un-checkout run. Having it checked out will interfere with
        # users pushing code to it.
        git checkout -B dummy

        cd -
    fi
done

Solution

  • refs/results/deadbeef is a user-defined ref. The default fetch refspec only covers branches (refs/heads/*), so to fetch such a ref and create it locally you need to specify an explicit refspec.

    git fetch origin refs/results/deadbeef:refs/results/deadbeef
    
    # Fetch all results refs and create the corresponding local refs.
    # Quote the refspec so the shell does not try to expand the asterisks.
    git fetch origin 'refs/results/*:refs/results/*'
    

    If you omit the destination from the refspec, the fetched commit is only recorded in FETCH_HEAD. You can then create the local ref yourself, the same way the post-receive hook does.

    git fetch origin refs/results/deadbeef
    # The fetched head is stored in FETCH_HEAD
    git update-ref refs/results/deadbeef FETCH_HEAD
    

    You can also fetch refs/results/deadbeef and, in the same command, store it under a different name or in a different namespace, such as a branch (refs/heads/*), a tag (refs/tags/*), or another user-defined ref (refs/foo/*).

    git fetch origin refs/results/deadbeef:refs/foo/deadbeef
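If typing the refspec on every fetch gets tedious, one option is to add it to the remote's configuration so that a plain git fetch picks up the results refs as well. The sketch below is a self-contained demo under stated assumptions: the "server" and "client" repositories are throwaway directories created with mktemp, and the results commit is simulated with git commit-tree rather than produced by the real hook.

```shell
#!/bin/sh -e
# Throwaway demo: a "server" repo holding a results ref, and a client clone.
demo=$(mktemp -d)
git init --quiet "$demo/server"
cd "$demo/server"
git config user.name 'demo'
git config user.email 'demo@example.invalid'
git commit --quiet --allow-empty --message 'Initial commit'
code=$(git rev-parse HEAD)

# Simulate the hook: a results commit reachable only via refs/results/<hash>.
results=$(git commit-tree -p "$code" -m 'bot: add results' "$code^{tree}")
git update-ref "refs/results/$code" "$results"

# A fresh clone does not receive refs/results/*.
git clone --quiet "$demo/server" "$demo/client"
cd "$demo/client"

# Add a second fetch refspec; later plain fetches also mirror refs/results/*.
git config --add remote.origin.fetch '+refs/results/*:refs/results/*'
git fetch --quiet origin

# The results commit is now available locally under the same ref name.
fetched=$(git rev-parse --verify "refs/results/$code")
echo "results for $code: $fetched"
```

The trade-off is that every fetch then downloads all results refs still present on the server. If the outputs are large, fetching individual refs on demand, as shown above, may be the better fit.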