Search code examples
pythonget

Shorten this code... Append JSON data from Multiple GET requests, then dump into single Excel File with Headers


I'm very new to Python.

The API I am using, is limited to 1000 results per request. I'm trying to get all the results/data for a specific date range, 1 month for this example.

Before I start, I run a GET request to see how many results would need to be exported in total.

For this example, the total = 14,841. I then create 15 queries/requests of 1000 each, using the "start" parameters of 1001, 2001, 3001, etc, and then take the response for each request and save that into a txt file containing the JSON data.

I am sure there is a way to shorten this code & process using Python. There are times when the Total is over 500,000 results, which forces me to do 500+ requests.

Below is an example of 5 requests / queries.

  1. How can I shorten this code to combine multiple queries & data into 1 file?

    import requests

    url = "https://api.leadspedia.com/core/v2/leads/getAll.do"

    headers = { 'Content-Type': "application/json", 'Authorization': "Basic XYZ" }

    querystring = {"fromDate":"2020-04-01","toDate":"2020-04-30","start":"0","limit":"1000"} querystring2 = {"fromDate":"2020-04-01","toDate":"2020-04-30","start":"1001","limit":"1000"} querystring3 = {"fromDate":"2020-04-01","toDate":"2020-04-30","start":"2001","limit":"1000"} querystring4 = {"fromDate":"2020-04-01","toDate":"2020-04-30","start":"3001","limit":"1000"} querystring5 = {"fromDate":"2020-04-01","toDate":"2020-04-30","start":"4001","limit":"1000"}

    response = requests.request("GET", url, headers=headers, params=querystring) response2 = requests.request("GET", url, headers=headers, params=querystring2) response3 = requests.request("GET", url, headers=headers, params=querystring3) response4 = requests.request("GET", url, headers=headers, params=querystring4) response5 = requests.request("GET", url, headers=headers, params=querystring5)

    file = open("out1.txt", "wb") file.write(response.content) file.close()

    file = open("out2.txt", "wb") file.write(response2.content) file.close()

    file = open("out3.txt", "wb") file.write(response3.content) file.close()

    file = open("out4.txt", "wb") file.write(response4.content) file.close()

    file = open("out5.txt", "wb") file.write(response5.content) file.close()


Solution

  • You can create a boolean indicating whether you are still obtaining new, valid results and a while loop using it:

    query_number = 1
    start = 0
    more_to_query = True
    
    while more_to_query:
      querystring={"fromDate":"2020-04-01","toDate":"2020-04-30","start":str(start),"limit":"1000"} 
      response = requests.request("GET", url, headers=headers, params=querystring) 
      file = open("out" + str(query_number) + ".txt", "wb")
      file.write(response.content) 
      file.close()
      start += 1000
      query_number += 1
      # set more_to_query to False when you detect you're no longer receiving new valid responses