
Get all file names from a GitHub repo through the GitHub API


Is it possible to get all the file names from a repository using the GitHub API?

I'm currently trying to do this with PyGithub, but I'm totally OK with making the requests manually as long as it works.

My algorithm so far is:

  1. Get the user repo names
  2. Get the user repo that matches a certain description
  3. ??? get repo file names?

Solution

  • This has to be done relative to a particular commit, since some files may be present in some commits and absent in others. So before you can look at files, you'll need a commit hash, which you can get from something like List commits on a repository:

    GET /repos/:owner/:repo/commits
    

    If you're only interested in the latest commit on a branch, you can set the sha parameter to the branch name; the first commit returned is then the tip of that branch:

    sha string SHA or branch to start listing commits from.
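
As a rough illustration of this first step, here is a minimal Python sketch using only the standard library. The helper names (`commits_url`, `latest_commit_sha`, `fetch_json`) are made up for this example, and `per_page=1` is only there to keep the response small. It assumes a public repository and unauthenticated access, which is subject to rate limits.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.github.com"


def fetch_json(url):
    """GET a URL and decode the JSON body (no authentication, public repos only)."""
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def commits_url(owner, repo, branch, per_page=1):
    """Build the 'list commits' URL, starting the listing at the given branch."""
    query = urllib.parse.urlencode({"sha": branch, "per_page": per_page})
    return f"{API}/repos/{owner}/{repo}/commits?{query}"


def latest_commit_sha(owner, repo, branch, get_json=fetch_json):
    """Return the hash of the newest commit on `branch`."""
    commits = get_json(commits_url(owner, repo, branch))
    return commits[0]["sha"]  # the listing is newest first
```

Calling `latest_commit_sha("octocat", "Hello-World", "master")` would perform one real request; passing a different `get_json` callable makes it possible to try the logic without network access.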

    Once you have a commit hash, you can inspect that commit

    GET /repos/:owner/:repo/git/commits/:sha
    

    which should return something like this (truncated from GitHub's documentation):

    {
      "sha": "...",
      "...": "...",
      "tree": {
        "url": "https://api.github.com/repos/octocat/Hello-World/git/trees/691272480426f78a0138979dd3ce63b77f706feb",
        "sha": "691272480426f78a0138979dd3ce63b77f706feb"
      },
      "...": "..."
    }
    

    Look at the hash of its tree, which is essentially a snapshot of the directory contents at that commit; in this case it is 691272480426f78a0138979dd3ce63b77f706feb. Now we can finally request the contents of that tree:

    GET /repos/:owner/:repo/git/trees/:sha
    
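Going from a commit response to the tree request can be sketched in a few lines of Python. The function names here are hypothetical, and the `commit` dictionary simply mirrors the truncated response shown earlier; in practice it would be the decoded JSON body of the commit request.

```python
def tree_sha_of_commit(commit):
    """Pull the tree hash out of a decoded git commit response."""
    return commit["tree"]["sha"]


def tree_url(owner, repo, tree_sha):
    """Build the URL for the 'get a tree' request."""
    return f"https://api.github.com/repos/{owner}/{repo}/git/trees/{tree_sha}"


# Only the fields needed here, copied from the truncated example response.
commit = {
    "tree": {
        "url": "https://api.github.com/repos/octocat/Hello-World/git/trees/691272480426f78a0138979dd3ce63b77f706feb",
        "sha": "691272480426f78a0138979dd3ce63b77f706feb",
    },
}

print(tree_url("octocat", "Hello-World", tree_sha_of_commit(commit)))
```

For this sample the built URL is the same as the `url` field already present in the commit's `tree` object, so following that field directly is an equally reasonable choice.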

    The output from GitHub's example (which uses a different tree hash) is:

    {
      "sha": "9fb037999f264ba9a7fc6274d15fa3ae2ab98312",
      "url": "https://api.github.com/repos/octocat/Hello-World/trees/9fb037999f264ba9a7fc6274d15fa3ae2ab98312",
      "tree": [
        {
          "path": "file.rb",
          "mode": "100644",
          "type": "blob",
          "size": 30,
          "sha": "44b4fc6d56897b048c772eb4087f854f46256132",
          "url": "https://api.github.com/repos/octocat/Hello-World/git/blobs/44b4fc6d56897b048c772eb4087f854f46256132"
        },
        {
          "path": "subdir",
          "mode": "040000",
          "type": "tree",
          "sha": "f484d249c660418515fb01c2b9662073663c242e",
          "url": "https://api.github.com/repos/octocat/Hello-World/git/blobs/f484d249c660418515fb01c2b9662073663c242e"
        },
        {
          "path": "exec_file",
          "mode": "100755",
          "type": "blob",
          "size": 75,
          "sha": "45b983be36b73c0788dc9cbcb76cbb80fc7bb057",
          "url": "https://api.github.com/repos/octocat/Hello-World/git/blobs/45b983be36b73c0788dc9cbcb76cbb80fc7bb057"
        }
      ]
    }
    

    As you can see, we have some blobs, which correspond to files, and some additional trees, which correspond to subdirectories. To list every file in the repository, repeat the tree request for each subtree, recursively.
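
The recursive walk could look roughly like the following Python sketch. `list_files` and `fetch_json` are names invented for this example, and it assumes a public repository and unauthenticated requests. Each subdirectory costs one extra request, so large repositories may hit rate limits.

```python
import json
import urllib.request


def fetch_json(url):
    """GET a URL and decode the JSON body (no authentication, public repos only)."""
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def list_files(owner, repo, tree_sha, get_json=fetch_json, prefix=""):
    """Yield the full path of every blob under a tree, descending into subtrees."""
    url = f"https://api.github.com/repos/{owner}/{repo}/git/trees/{tree_sha}"
    for entry in get_json(url)["tree"]:
        path = prefix + entry["path"]
        if entry["type"] == "blob":
            yield path  # a regular file
        elif entry["type"] == "tree":
            # a subdirectory: request its tree and prefix its entries with its path
            yield from list_files(owner, repo, entry["sha"], get_json, path + "/")
        # any other entry type (such as a submodule) is skipped
```

The trees endpoint also accepts a `recursive` query parameter (for example `?recursive=1`) that returns the entries of all subtrees in a single response, with a `truncated` field in the response indicating that the listing was cut off. When that fits, it avoids the per-directory requests made above.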