Tags: python, pandas, google-sheets, google-sheets-api, gspread

GSPREAD bulk export of dataframes on separate sheets without triggering API write request limits


I am running up against the Google Sheets API quota for write requests when writing some relatively small dataframes to a single spreadsheet.

The dataframes are held in a dictionary. The KEY is used for the SHEET name and the VALUE is the dataframe to be exported.

Right now I'm running through the dataframes in a loop, creating the SHEET and saving the data.

gc = gspread.oauth(credentials_filename=credential_path("credentials.json"),
                   authorized_user_filename=credential_path("token.json")
                   )

gc_sh = gc.create('rote_output_' + yyyymmddhhss())


# (rote_data is the dictionary of dataframes)
for k, v in rote_data.items():
    gc_sh.add_worksheet(title=k, rows=100, cols=20)
    worksheet = gc_sh.worksheet(k)
    worksheet.update([v.columns.values.tolist()] + (v.fillna('')).values.tolist())
    worksheet.freeze(rows=1)

It feels like the four 'operations' in the loop are each treated as a separate API call, so with 15 dataframes I fire off 60 requests almost instantly, triggering the API block.

Is there a way to conduct the updates as a 'bulk' request so as not to hit these limits? The batch_update() documentation is light on detail, and I'm not sure whether it is possible to create and then save down to different sheets using this approach, which would be ideal.
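For what it's worth, the per-loop cost can be tallied explicitly before touching the API. A minimal sketch of the cost model (the mapping of gspread calls to API requests is my assumption from current gspread behaviour: add_worksheet, update, and freeze each issue one write request, while worksheet(k) is a metadata read, so it does not count against the write quota):

```python
# Rough cost model for the two approaches. Assumes each gspread call
# maps to exactly one Sheets API request: add_worksheet, update and
# freeze are writes; worksheet(k) is a read and is excluded here.

WRITES_PER_SHEET_LOOP = 3   # add_worksheet + update + freeze
WRITES_BATCHED = 2          # one batch_update + one values_batch_update

def write_requests(n_sheets: int, batched: bool) -> int:
    """Estimated number of Sheets API *write* requests issued."""
    if batched:
        return WRITES_BATCHED
    return n_sheets * WRITES_PER_SHEET_LOOP

print(write_requests(15, batched=False))  # 45 writes in the loop version
print(write_requests(15, batched=True))   # 2 writes when batched
```

Under this assumption, 15 dataframes cost roughly 45 writes (plus 15 reads) in the loop version, but a batched version needs only two write requests regardless of the number of sheets, which is comfortably inside the default per-minute quota.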

**EDIT:** How to create multiple sheets in a spreadsheet by gspread and python?

From this question/answer it appears that batch_update might be able to do this, although it's not clear how I would combine the request to create a sheet with the actual updating of the data on each sheet:

client =  # Please use your client.

employee_names = ["Jonny", "Emma"]  # Please set the sheet names.
spreadsheetId = "###"  # Please set your spreadsheet ID.

requests = [
    {
        "addSheet": {
            "properties": {
                "title": e,
                "gridProperties": {"rowCount": 100, "columnCount": 20},
            }
        }
    }
    for e in employee_names
]
spreadsheet = client.open_by_key(spreadsheetId)
spreadsheet.batch_update({"requests": requests})

Solution

  • I believe your goal is as follows.

    • You want to create a new Google Spreadsheet.
    • You want to create one sheet per dataframe in the new Spreadsheet (15 in your example). Each sheet has 100 rows and 20 columns, and you want to freeze the 1st row of each sheet.
    • You want to write the values from rote_data to each sheet.
    • You want to reduce the process cost.

    In this case, how about the following modification?

    Modified script:

    gc_sh = gc.create('rote_output_' + yyyymmddhhss())
    
    # --- I modified the below script.
    # Create request bodies.
    request1 = []
    request2 = []
    for k, v in rote_data.items():
        request1.append({"addSheet": {"properties": {"title": k, "gridProperties": {"rowCount": 100, "columnCount": 20, "frozenRowCount": 1}}}})
        request2.append({"range": f"'{k}'!A1", "values": [v.columns.values.tolist()] + (v.fillna('')).values.tolist()})
    
    # Use Sheets API.
    gc_sh.batch_update({"requests": request1})
    gc_sh.values_batch_update({"data": request2, "valueInputOption": "USER_ENTERED"})
    
    • In this modification, only two Sheets API calls are made, one with request1 and one with request2. The first request inserts the new sheets (with the first row frozen via frozenRowCount); the second writes the values into each sheet.

    Note:

    • In the case of the Sheets API, when Method: spreadsheets.create is used, gc_sh = gc.create('rote_output_' + yyyymmddhhss()) and gc_sh.batch_update({"requests": request1}) could be merged into a single API call. But at the current stage, it seems that gspread has no method that uses "Method: spreadsheets.create" with sheet properties, so I proposed the modification above.
    • I don't know the actual values in rote_data, so I used values_batch_update in order to be able to set "valueInputOption": "USER_ENTERED".
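Even with batching, a spreadsheet with many tabs can still brush against the per-minute quota, so a retry-with-backoff wrapper is a common complementary safeguard. A sketch (the wrapper is a hypothetical helper of my own, not part of gspread; in real code you would catch gspread.exceptions.APIError rather than the generic Exception used here, and pass time.sleep as the sleep function):

```python
import time

def with_backoff(fn, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure, wait and retry with exponentially
    growing delays (1s, 2s, 4s, ...), re-raising on the last try."""
    for attempt in range(max_tries):
        try:
            return fn()
        except Exception:  # real code: except gspread.exceptions.APIError
            if attempt == max_tries - 1:
                raise
            sleep(base_delay * (2 ** attempt))

# Demonstration with a fake flaky call that succeeds on the third try;
# the injectable sleep lets us record the delays instead of waiting.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("RESOURCE_EXHAUSTED")
    return "ok"

delays = []
print(with_backoff(flaky, sleep=delays.append))  # prints: ok
print(delays)                                    # prints: [1.0, 2.0]
```

In practice you would wrap the two batched calls, e.g. with_backoff(lambda: gc_sh.batch_update({"requests": request1}), sleep=time.sleep).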

    References: