git, github-api

What is the tree_id field in the GitHub REST API?


Four GitHub REST API endpoints (for example, "List workflow runs for a repository" in the GitHub Docs) return a tree_id field inside the head_commit object of each workflow run.

I can't find documentation for it anywhere, neither in the REST API docs nor in the GraphQL docs. GitHub's docs search and Google return only the documentation of those endpoints (and similar docs for some check suites endpoints). I speculate that tree_id is some sort of hash of the entire contents of the files in the tree for that commit, and that seems to be how it's used (for example, in fkirc/skip-duplicate-actions), but is that true?


Solution

  • It's not specific to GitHub's API; this is plain old Git. These fields are present in every commit: a Git commit is just a reference to a tree object plus metadata. You can run git cat-file commit <rev> on any commit to see the raw commit object, including the tree it references.

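    Your speculation is very close. The tree_id is the hash of the commit's root tree object, which covers the entire contents (filenames, permissions, and blobs) of the tracked files. The commit's own id (the id field of head_commit) is a hash of the commit object, which depends on the referenced tree and all of the commit metadata (timestamps, parents, author, committer, signatures, ...). Two commits with identical file contents therefore share a tree_id even when their commit ids differ.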

    See also 10.2 Git Internals - Git Objects in the Pro Git book for more on tree objects and commit objects.
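
To make the relationship between the two ids concrete, here is a minimal Python sketch that hashes objects the way Git does for a default SHA-1 repository. The helper names (git_object_id, tree_body, commit_body), the author string, and the timestamps are all made up for illustration. It handles only a flat tree of regular files, so it is a simplification and not a replacement for Git's own tooling.

```python
import hashlib

# Hypothetical identity used only for this illustration.
AUTHOR = "Example Author <author@example.com>"


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object as Git does: SHA-1 over '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize a flat tree from (mode, name, hex_id) tuples.

    Simplification: plain files only, sorted by name. Real Git also
    handles subtrees and sorts directory names slightly differently.
    """
    out = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}\0".encode() + bytes.fromhex(hex_id)
    return out


def commit_body(tree_id: str, message: str, author: str, timestamp: int) -> bytes:
    """Serialize a parentless commit that points at tree_id."""
    ident = f"{author} {timestamp} +0000"
    return f"tree {tree_id}\nauthor {ident}\ncommitter {ident}\n\n{message}\n".encode()


# One file, one tree.
blob_id = git_object_id("blob", b"hello world\n")
tree_id = git_object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))

# Two commits of the same tree, made one minute apart.
first = git_object_id("commit", commit_body(tree_id, "Add greeting", AUTHOR, 1700000000))
second = git_object_id("commit", commit_body(tree_id, "Add greeting", AUTHOR, 1700000060))

# Changing the file contents changes the blob id, and therefore the tree id.
changed_blob_id = git_object_id("blob", b"hello there\n")
changed_tree_id = git_object_id("tree", tree_body([("100644", "hello.txt", changed_blob_id)]))

print("tree_id:        ", tree_id)
print("first commit:   ", first)
print("second commit:  ", second)
print("changed tree_id:", changed_tree_id)
```

Both commits reference the same tree_id but get different commit ids, because the timestamp is part of the commit object. This is why a tool can compare tree_id values to detect runs whose file contents are identical even though the commits differ.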