I'm writing a Python script to generate how many changes were made within a timeframe for all projects, but when I use the Gerrit REST API I can only get up to a maximum of 500 unique users, and I want to see all of them, even if I use a long timeframe (1 year). This is my function for the API:
import json
import requests

def requestAPICall(url):
    """
    does API stuff
    """
    response = requests.get(url)
    if response.status_code == 200:
        JSON_response = json.loads(response.text[4:])  # skip the ")]}'" prefix Gerrit prepends
        generateJSON(JSON_response)
        return (JSON_response, True)
    print("Error Occurred")
    return (response, False)
This is the link I used for the request in this case: https://chromium-review.googlesource.com/changes/?q=since:%222022-01-01%2011:26:25%20%2B0100%22+before:%222023-01-01%2011:31:25%20%2B0100%22
I have tried curl commands, but I do not know whether that works.
There is a default limit on the number of returned items, and if you're making anonymous queries I don't believe you can change this. From the documentation:
The query string must be provided by the q parameter. The n parameter can be used to limit the returned results. The no-limit parameter can be used to remove the default limit on queries and return all results (does not apply to anonymous requests). This might not be supported by all index backends.
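For example, here is a minimal sketch (not from the original post) of how those parameters are passed with requests; the q, n and no-limit names come from the documentation above, and no-limit only takes effect on authenticated requests:

import json
import requests

base = "https://chromium-review.googlesource.com/changes/"
query = 'since:"2022-01-01 00:00:00" before:"2023-01-01 00:00:00"'

# Anonymous request: the result count can be capped explicitly with n,
# but the server-side maximum still applies.
res = requests.get(base, params={"q": query, "n": 25})
changes = json.loads(res.text[4:])  # skip the ")]}'" prefix Gerrit prepends

# With an authenticated request you could instead pass no-limit to lift the
# cap entirely (assuming the backend supports it), e.g.
# params={"q": query, "no-limit": "true"}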
However, you can return paginated results using the start parameter:

If the number of changes matching the query exceeds either the internal limit or a supplied n query parameter, the last change object has a _more_changes: true JSON field set. The S or start query parameter can be supplied to skip a number of changes from the list.

So if the last change in the result has _more_changes: true set, you can make a subsequent request using the start parameter.
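To make that concrete, after json.loads() the tail of a truncated page looks roughly like this (a hypothetical, heavily trimmed example; field names follow the ChangeInfo documentation):

truncated_page = [
    # ... earlier ChangeInfo entries ...
    {
        "project": "chromium/src",
        "status": "MERGED",
        "subject": "Some change subject",
        "_number": 1234567,
        "_more_changes": True,  # set only on the last entry of a truncated page
    },
]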
That means your Python code is going to look something like:
import json
import requests
import sys


class Gerrit:
    """Wrap up Gerrit API functionality in a simple class to make
    it easier to consume from our code. This limited example only
    supports the `changes` endpoint.

    See https://gerrit-review.googlesource.com/Documentation/rest-api.html
    for complete REST API documentation.
    """

    def __init__(self, baseurl):
        self.baseurl = baseurl

    def changes(self, query, start=None, limit=None, options=None):
        """This implements the API described in [1].

        [1]: https://gerrit-review.googlesource.com/Documentation/rest-api-changes.html
        """
        params = {"q": query}
        if start is not None:
            params["S"] = start
        if limit is not None:
            params["n"] = limit
        if options is not None:
            params["o"] = options

        res = requests.get(f"{self.baseurl}/changes", params=params)
        print(f"fetched [{res.status_code}]: {res.url}", file=sys.stderr)
        res.raise_for_status()
        return json.loads(res.text[4:])
# And here is an example in which we use the Gerrit class to perform a
# query against https://chromium-review.googlesource.com. This is similar
# to the query in your question, but using a constrained date range in order
# to limit the total number of results.
g = Gerrit("https://chromium-review.googlesource.com")

all_results = []
start = 0
while True:
    res = g.changes(
        'since:"2022-12-31 00:00:00" before:"2023-01-01 00:00:00"',
        limit=200,
        start=start,
    )
    if not res:
        break

    all_results.extend(res)
    if not res[-1].get("_more_changes"):
        break

    start += len(res)

# Here we're just dumping all the results as a JSON document on
# stdout.
print(json.dumps(all_results))
This demonstrates how to use the limit parameter to control the number of results returned in a "page", and the start parameter to request additional pages of results.
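Once all_results holds every page, the per-project and unique-user counts the question is after can be computed with a sketch like this (it assumes the default query output, where each change carries a project name and an owner with an _account_id; richer account details would need the o=DETAILED_ACCOUNTS option):

from collections import Counter

# Count changes per project across everything we fetched.
changes_per_project = Counter(change["project"] for change in all_results)

# Collect the distinct owner account ids seen in the results.
unique_owners = {change["owner"]["_account_id"] for change in all_results}

print(f"{len(all_results)} changes across {len(changes_per_project)} projects")
print(f"{len(unique_owners)} unique change owners")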
But look out! The example query here covers only a couple of days and returns over 3000 results; I suspect that any attempt to fetch a year's worth of data, particularly over an anonymous connection, is going to run into some sort of server rate limit.
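If a single year-long query does prove too heavy, one option (just a sketch reusing the Gerrit class above; month_windows is a hypothetical helper and the pause length is a guess) is to split the range into monthly windows and paginate each window separately:

import datetime
import time

def month_windows(year):
    """Yield (since, before) ISO date strings covering each month of `year`."""
    for month in range(1, 13):
        since = datetime.date(year, month, 1)
        before = datetime.date(year + 1, 1, 1) if month == 12 else datetime.date(year, month + 1, 1)
        yield since.isoformat(), before.isoformat()

all_results = []
for since, before in month_windows(2022):
    start = 0
    while True:
        res = g.changes(
            f'since:"{since} 00:00:00" before:"{before} 00:00:00"',
            limit=200,
            start=start,
        )
        if not res:
            break
        all_results.extend(res)
        if not res[-1].get("_more_changes"):
            break
        start += len(res)
    time.sleep(1)  # small pause between windows to stay friendly to the server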