I've been toying around with gspread to push data to a Google Sheet and came up with a working script in python.
Quick background on the "job" this script has to do:
Now, as said, I have this working with the code underneath. But this does (at least to my understanding) uses 1 API Call per row it updates, but my sheet will have 1000+ rows with usernames, so this might end up using a lot of API calls. So I would prefer to do this in bulk. So temporarily store it and update all rows in 1 big push. From Gspread documentation I noticed that it should be possible if I can assign exact cell and column values, but I don't know how to build up my original input data to facilitate this.
My end user also occasionally inputs empty rows (god knows why..) but I noticed that my current logic fills in data here, which actually needs to go to the row below the empty one.
So I would love to get your input on how I can optimize this, and solve 2 'issues' with my current script:
Here's my code:
#import Google
import gspread
from oauth2client.service_account import ServiceAccountCredentials
#Setting up connection to Google Sheet and picking up the initial values
scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('file_init.json',scope)
client = gspread.authorize(creds)
sheet = client.open('Workbookname').sheet1
pp = pprint.PrettyPrinter()
ig_username_column = 2
ig_data_column = 6
ig_usernames = sheet.col_values(ig_username_column)
ig_names = []
i = 2
t = 2
for user in ig_usernames:
ig_clean = remove_prefix(user,'@')
ig_names.append(ig_clean)
print(ig_names)
for name in ig_names[1:]:
if len(name) != 0:
print(name)
ig_url = f'https://www.instagram.com/{name}'
print(ig_url)
data = instagram_metrics(ig_url)
sheet.update_cell(i, ig_data_column, data[2])
i += 1
else:
i += 1 #this is here to skip over empty rows in the Sheet
continue
sleep(randint(3,6))
data[2]
of data = instagram_metrics(ig_url)
) from the row 2 of the column "F".18410
of (20, 'username', 18410, 937)
retrieved by instagram_metrics(ig_url)
to the Spreadsheet.If my understanding is correct, how about this modification? In this modification, requests
is created in the for loop. Then, request
is put to the Spreadsheet using the method of update_cells()
.
Please modify as follows.
From:for name in ig_names[1:]:
if len(name) != 0:
print(name)
ig_url = f'https://www.instagram.com/{name}'
print(ig_url)
data = instagram_metrics(ig_url)
sheet.update_cell(i, ig_data_column, data[2])
i += 1
else:
i += 1 #this is here to skip over empty rows in the Sheet
continue
sleep(randint(3,6))
To:
requests = []
for name in ig_names[1:]:
if len(name) != 0:
print(name)
ig_url = f'https://www.instagram.com/{name}'
print(ig_url)
data = instagram_metrics(ig_url)
requests.append(data[2])
else:
requests.append('')
continue
# Select a range
cell_list = worksheet.range('F2:F' + str(len(requests) + 1))
for i, cell in enumerate(cell_list):
cell.value = requests[i]
# Update in batch
worksheet.update_cells(cell_list)
sleep(randint(3,6))
might not be required.