Search code examples
pythonpython-3.xgoogle-sheets-apigspread

Need help optimizing gspread API calls


I've been toying around with gspread to push data to a Google Sheet and came up with a working script in python.

Quick background on the "job" this script has to do:

  • Open Google Sheet
  • Read Instagram usernames from column B
  • Collect data from Instagram
  • Fill in the data in column F

Now, as said, I have this working with the code underneath. But this does (at least to my understanding) uses 1 API Call per row it updates, but my sheet will have 1000+ rows with usernames, so this might end up using a lot of API calls. So I would prefer to do this in bulk. So temporarily store it and update all rows in 1 big push. From Gspread documentation I noticed that it should be possible if I can assign exact cell and column values, but I don't know how to build up my original input data to facilitate this.

My end user also occasionally inputs empty rows (god knows why..) but I noticed that my current logic fills in data here, which actually needs to go to the row below the empty one.

So I would love to get your input on how I can optimize this, and solve 2 'issues' with my current script:

  • reduce the number of API calls
  • properly handle empty rows

Here's my code:

#import Google 
import gspread
from oauth2client.service_account import ServiceAccountCredentials

#Setting up connection to Google Sheet and picking up the initial values
scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('file_init.json',scope)
client = gspread.authorize(creds)
sheet = client.open('Workbookname').sheet1
pp = pprint.PrettyPrinter()

ig_username_column = 2
ig_data_column = 6
ig_usernames = sheet.col_values(ig_username_column)
ig_names = []
i = 2
t = 2
for user in ig_usernames:
    ig_clean = remove_prefix(user,'@')
    ig_names.append(ig_clean)
    print(ig_names)


for name in ig_names[1:]:
    if len(name) != 0:
        print(name)
        ig_url = f'https://www.instagram.com/{name}'
        print(ig_url)
        data = instagram_metrics(ig_url)
        sheet.update_cell(i, ig_data_column, data[2])
        i += 1
    else:
        i += 1 #this is here to skip over empty rows in the Sheet
        continue

    sleep(randint(3,6))

Solution

    • You want to put the values (data[2] of data = instagram_metrics(ig_url)) from the row 2 of the column "F".
    • For example, you want to put the value of 18410 of (20, 'username', 18410, 937) retrieved by instagram_metrics(ig_url) to the Spreadsheet.
    • You want to put all values of "500 rows" to the Spreadsheet by one API call.

    If my understanding is correct, how about this modification? In this modification, requests is created in the for loop. Then, request is put to the Spreadsheet using the method of update_cells().

    Modified script:

    Please modify as follows.

    From:
    for name in ig_names[1:]:
        if len(name) != 0:
            print(name)
            ig_url = f'https://www.instagram.com/{name}'
            print(ig_url)
            data = instagram_metrics(ig_url)
            sheet.update_cell(i, ig_data_column, data[2])
            i += 1
        else:
            i += 1 #this is here to skip over empty rows in the Sheet
            continue
    
        sleep(randint(3,6))
    
    To:
    requests = []
    for name in ig_names[1:]:
        if len(name) != 0:
            print(name)
            ig_url = f'https://www.instagram.com/{name}'
            print(ig_url)
            data = instagram_metrics(ig_url)
            requests.append(data[2])
        else:
            requests.append('')
            continue
    
    # Select a range
    cell_list = worksheet.range('F2:F' + str(len(requests) + 1))
    
    for i, cell in enumerate(cell_list):
        cell.value = requests[i]
    
    # Update in batch
    worksheet.update_cells(cell_list)
    

    Note:

    • I think that in this modification, sleep(randint(3,6)) might not be required.
    • This modified script supposes that you have already been able to write and read values using Sheets API.

    Reference: