github graphql github-api

How to use GitHub API to get a repository's dependents information in GitHub?


When using the GitHub API v4, I can easily get a repository's dependencies through repository.dependencyGraphManifests. But I can't find any way to get the dependents through the API v4, even though I can see them under Insights -> Dependency graph -> Dependents. Is there any way to get a repository's dependents, whether through the GitHub API or by some other means?
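For context, the dependencies side mentioned above can be sketched roughly as follows. This only builds the request and sends nothing. The helper name build_request, the page sizes, and the preview Accept header are assumptions for illustration; check the current GraphQL schema before relying on them.

```python
import json

# GraphQL query for a repository's dependency manifests (dependencies, not dependents).
QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    dependencyGraphManifests(first: 10) {
      nodes {
        filename
        dependencies(first: 50) {
          nodes { packageName requirements }
        }
      }
    }
  }
}
"""


def build_request(owner, name, token):
    """Return (url, headers, body) for the GraphQL call; nothing is sent here."""
    headers = {
        "Authorization": "bearer " + token,
        # Assumption: the dependency graph fields may need this preview media type.
        "Accept": "application/vnd.github.hawkgirl-preview+json",
    }
    body = json.dumps({"query": QUERY, "variables": {"owner": owner, "name": name}})
    return "https://api.github.com/graphql", headers, body
```

The returned tuple can be passed to any HTTP client as a POST request. There is no matching field for dependents in this query, which is the gap the question is about.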


Solution

  • I don't think you can get the dependent projects through the GitHub API (REST or GraphQL). One workaround is to scrape the dependents page, as in the following Python script:

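    import requests
    from bs4 import BeautifulSoup

    repo = "expressjs/express"
    max_pages = 3  # maximum number of result pages to fetch
    url = "https://github.com/{}/network/dependents".format(repo)

    for _ in range(max_pages):
        print("GET " + url)
        r = requests.get(url)
        r.raise_for_status()  # stop early on HTTP errors such as rate limiting
        soup = BeautifulSoup(r.content, "html.parser")

        # Each dependent sits in a "Box-row" div with an owner link and a repository link.
        data = []
        for row in soup.find_all("div", {"class": "Box-row"}):
            owner = row.find("a", {"data-repository-hovercards-enabled": ""})
            name = row.find("a", {"data-hovercard-type": "repository"})
            if owner and name:  # skip rows that do not describe a repository
                data.append("{}/{}".format(owner.text.strip(), name.text.strip()))

        print(data)
        print(len(data))

        # Follow the "Next" link only; from page 2 on, the first link is "Previous".
        container = soup.find("div", {"class": "paginate-container"})
        links = container.find_all("a") if container else []
        next_links = [a for a in links if a.get_text(strip=True) == "Next"]
        if next_links:
            url = next_links[0]["href"]
        else:
            break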
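If installing BeautifulSoup is not an option, the parsing step can also be sketched with only the standard library. This is a minimal sketch that assumes the same page markup the script above relies on (an owner link carrying data-repository-hovercards-enabled, followed by a link with data-hovercard-type="repository"). The names DependentsParser and parse_dependents are hypothetical, and the markup may change without notice.

```python
from html.parser import HTMLParser


class DependentsParser(HTMLParser):
    """Collect "owner/repo" names from a dependents page (markup is an assumption)."""

    def __init__(self):
        super().__init__()
        self.dependents = []
        self._owner = None  # owner text seen for the current row
        self._field = None  # link we are inside: "owner", "repo", or None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if "data-repository-hovercards-enabled" in attrs:
            self._field = "owner"
        elif attrs.get("data-hovercard-type") == "repository":
            self._field = "repo"

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "owner":
            self._owner = text
        elif self._owner is not None:
            # A repository link completes the pair started by the owner link.
            self.dependents.append("{}/{}".format(self._owner, text))
            self._owner = None
        self._field = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._field = None


def parse_dependents(html):
    """Return the list of "owner/repo" strings found in the given HTML text."""
    parser = DependentsParser()
    parser.feed(html)
    return parser.dependents
```

Keeping the parsing separate from the HTTP request makes it possible to check the extraction against a saved copy of the page before running it against GitHub.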
    
