I'm scraping data from GitHub via PyGithub. My issue is that I receive this error while scraping:
github.GithubException.GithubException: 403 {'documentation_url': 'https://developer.github.com/v3/#rate-limiting', 'message': 'API rate limit exceeded for XXXXX.'}
When I curl the API, I receive:
curl -i https://api.github.com/users/XXXXXX
HTTP/1.1 200 OK
Server: GitHub.com
Date: Thu, 14 Jul 2016 15:03:51 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 1301
Status: 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 52
X-RateLimit-Reset: 1468509718
Cache-Control: public, max-age=60, s-maxage=60
Vary: Accept
Last-Modified: Wed, 08 Jun 2016 13:29:08 GMT
Note the rate-limit headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 52
X-RateLimit-Reset: 1468509718
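To make those three headers easier to read, here is a minimal sketch that parses them and converts the reset value (a Unix timestamp in seconds) into a UTC time. The helper name parse_rate_limit is hypothetical, and the header values are copied from the curl output above.

```python
from datetime import datetime, timezone


def parse_rate_limit(headers):
    """Return (limit, remaining, reset time in UTC) from a header mapping."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    # X-RateLimit-Reset is expressed in seconds since the Unix epoch (UTC).
    reset_at = datetime.fromtimestamp(
        int(headers["X-RateLimit-Reset"]), tz=timezone.utc
    )
    return limit, remaining, reset_at


# Header values taken from the curl response above.
headers = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "52",
    "X-RateLimit-Reset": "1468509718",
}

limit, remaining, reset_at = parse_rate_limit(headers)
print(f"{remaining}/{limit} requests left, window resets at {reset_at.isoformat()}")
# prints: 52/60 requests left, window resets at 2016-07-14T15:21:58+00:00
```

The reset time falls about 18 minutes after the Date header of the same response.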
If I run my Python program again, I get another "API rate limit exceeded" message. I read the GitHub API documentation and, as far as I can tell, I still have 52 requests left. If I can provide any more information to improve this question, let me know. Thank you.
Edit: To clarify, I am using credentials to log in to GitHub.
from github import Github

ORGANIZATION = "ORG"
PERSONAL_ACCESS_TOKEN = "TOKEN"
g = Github(PERSONAL_ACCESS_TOKEN, per_page=100)
github_organization = g.get_organization(ORGANIZATION)
So the issue wasn't with my rate limit; rather, it was with the message the PyGithub wrapper was returning. I traced the error back and found this class in the source code: https://github.com/PyGithub/PyGithub/blob/master/github/Requester.py
Looking into the __createException function, I noticed this:
    def __createException(self, status, headers, output):
        if status == 401 and output.get("message") == "Bad credentials":
            cls = GithubException.BadCredentialsException
        elif status == 401 and 'x-github-otp' in headers and re.match(r'.*required.*', headers['x-github-otp']):
            cls = GithubException.TwoFactorException  # pragma no cover (Should be covered)
        elif status == 403 and output.get("message").startswith("Missing or invalid User Agent string"):
            cls = GithubException.BadUserAgentException
        elif status == 403 and output.get("message").startswith("API Rate Limit Exceeded"):
            cls = GithubException.RateLimitExceededException
        elif status == 404 and output.get("message") == "Not Found":
            cls = GithubException.UnknownObjectException
        else:
            cls = GithubException.GithubException
        return cls(status, output)
Looking at the message of the exception I received, I assumed it was a RateLimitExceededException.
However, looking at the actual exception class, I noticed it was GithubException.GithubException, which appears to be the catch-all raised when none of the more specific checks match.
So, as far as I could tell, this wasn't a rate-limit problem, because the curl output showed I still had requests left when I received the exception.
Unfortunately, it's a non-specific exception. This answers my initial question for now.
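One possible reason the generic class was chosen: the check quoted above tests for "API Rate Limit Exceeded", while the message in my error reads "API rate limit exceeded for XXXXX.", and str.startswith is case-sensitive. The sketch below shows the difference using the message from my error. The helper looks_like_rate_limit is a hypothetical name of my own, not part of PyGithub.

```python
def looks_like_rate_limit(status, message):
    """Case-insensitive check for a rate-limit style 403 response."""
    return status == 403 and "rate limit exceeded" in (message or "").lower()


# Message copied from the exception at the top of the question.
message = "API rate limit exceeded for XXXXX."

# The case-sensitive prefix test from the quoted source does not match it.
strict_match = message.startswith("API Rate Limit Exceeded")
loose_match = looks_like_rate_limit(403, message)

print(strict_match, loose_match)
```

In calling code, a helper like this could be applied to the status and message of a caught GithubException to decide whether to wait for the reset time before retrying.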
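Update: I was also curling the API without a token, so it was not showing me the correct information. The limit of 60 in those headers is the quota for unauthenticated requests, which is counted separately from the quota for my token. With the token, it shows that I did use up all my requests.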
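For anyone wanting to repeat the authenticated check from Python instead of curl, here is a rough sketch using only the standard library. It assumes the token is sent in an Authorization header of the form "token <value>" and queries GitHub's rate_limit endpoint. The function names and the User-Agent string are placeholders I made up.

```python
import json
import urllib.request

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def build_rate_limit_request(token):
    """Build (but do not send) an authenticated request for the rate-limit status."""
    return urllib.request.Request(
        RATE_LIMIT_URL,
        headers={
            "Authorization": f"token {token}",
            # Placeholder value; GitHub rejects requests without a User-Agent.
            "User-Agent": "rate-limit-check",
        },
    )


def fetch_rate_limit(token):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(build_rate_limit_request(token)) as response:
        return json.load(response)


# "TOKEN" is the same placeholder used in the question.
request = build_rate_limit_request("TOKEN")
print(request.full_url, request.get_header("Authorization"))
```

Calling fetch_rate_limit with a real token should report the quota for that token rather than the unauthenticated quota for the IP address.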