How to save a list of JSON objects to a JSON file in Python and combine multiple lists?


I am working with an API that returns the following format:

{
    "count": 900,
    "next": "api/?data&page=2",
    "previous": null,
    "results": 
        [{json object 1}, {json object 2}, {...}]
}

The problem is that I want to retrieve all "results" from all pages and save them into one JSON file.

I'm thinking of a while loop that keeps making requests to the API and aggregating the resulting "results" into one variable, until the "next" value is null.

Something like

while json1["next"] != null:
    r = request.get(apiURL, verify=False, allow_redirects=True, headers=headers, timeout=10)
    raw_data = r.json()["results"]

    final_data.update(raw_data)

I tried it, but since r.json()["results"] is a list, I don't know how to handle the different formats and turn that into a JSON file.

When trying to do final_data.update(raw_data) it gives me an error saying:

'list' object has no attribute 'update'

Or when trying json.loads(raw_data) it gives me:

TypeError: the JSON object must be str, bytes, or bytearray, not list

Solution

  • A JSON file is a text file. To save raw_data, which is a list, to a file, you first need to serialize it to a string with json.dumps():

    import json
    
    with open('output.json', 'w', encoding="utf-8") as f:
        # Serialize the Python list to a JSON-formatted string
        raw_data_as_string = json.dumps(raw_data)
        # Write that string to the file
        f.write(raw_data_as_string)
    

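    The errors in the question come from mixing up types: update() is a dict method, so it does not exist on a list, and json.loads() parses a JSON string, whereas r.json() has already returned parsed Python objects. To aggregate the results from different pages, make final_data a list created before you iterate over the pages, then call final_data.extend(raw_data) inside the loop, where raw_data holds the results from a single page.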

    After the loop, serialize final_data with json.dumps(final_data) and write it to the file as shown earlier.
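
Putting the pieces together, the whole loop might look roughly like the sketch below. It is only an illustration under some assumptions: the helper names fetch_all_results, get_page_with_requests and save_json are made up for this example, and the "next" value is assumed to be a URL that can be requested directly (a relative link like the one in the question would first have to be resolved against the API's base URL). The demonstration at the bottom uses two fake pages so it runs without network access.

```python
import json


def fetch_all_results(first_url, get_page):
    """Collect the "results" of every page, following "next" until it is None.

    get_page is any callable that takes a URL and returns the parsed page
    (a dict with "results" and "next" keys, as in the question).
    """
    final_data = []          # created once, before the loop
    url = first_url
    while url is not None:   # JSON null is parsed as Python None
        page = get_page(url)
        final_data.extend(page["results"])  # add this page's objects
        url = page["next"]   # assumption: directly requestable URL
    return final_data


def get_page_with_requests(url):
    """Hypothetical fetcher based on the third-party requests library."""
    import requests  # imported here so the rest of the sketch runs without it

    r = requests.get(url, timeout=10)
    r.raise_for_status()
    return r.json()  # already parsed, so no json.loads() is needed


def save_json(data, path):
    """Write data to path as JSON text."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


# Offline demonstration with two fake pages instead of a real API.
fake_pages = {
    "page1": {"count": 3, "next": "page2", "previous": None,
              "results": [{"id": 1}, {"id": 2}]},
    "page2": {"count": 3, "next": None, "previous": "page1",
              "results": [{"id": 3}]},
}
all_results = fetch_all_results("page1", fake_pages.__getitem__)
print(all_results)  # [{'id': 1}, {'id': 2}, {'id': 3}]
```

Against a real API, the call would presumably be fetch_all_results(apiURL, get_page_with_requests) followed by save_json(all_results, 'output.json'). Here json.dump() writes straight to the file object, which is equivalent to calling json.dumps() and then f.write().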