Search code examples
pythoncsvgoogle-sheets-apigspread

Uploading .csv to Google Sheets via Gspread


I am using gspread to refresh a worksheet in Google Sheets with data from a .CSV file. As much as I've looked around the web, I'm having a hard time finding what I think should be a clean answer to my question.

I have the .CSV file in my project directory. Here's my code thus far:

import gspread
from oauth2client.service_account import ServiceAccountCredentials
import csv

scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('client_secret.json', scope)
client = gspread.authorize(creds)
sheet = client.open_by_key('Spreadsheet Key')
worksheet = sheet.worksheet('Worksheet Name')
worksheet.clear()

At this point, I'm not sure what method I should use to mass-upload all the data in the .CSV. I've read that .update_cells() in gspread will only use a single call to the Google Sheets API and is the fastest method, so my question is this:

Using .update_cells(), how can I iterate through my .CSV to be able to post to Sheets?

Some info about my .CSV file is that it has 9 columns, but I need the code to handle any changes in the quantity of rows. Any help is greatly appreciated!


Solution

  • First of, if you are ok with the data going into A1 on the first spreadsheet, you can use gspread.Client.import_csv(), for a very simple option.

    Otherwise, read-on.

    To use update_cells(), pass it a list of Cell() objects.

    There are gspread objects that you get from worksheet.range(), worksheet.cell() , worksheet.find(), or worksheet.findall(). (Each of these functions makes a web call when you access them, so try to condense the number of calls.)

    Then, for each cell object, change the cell.value field, to your csv data.

    for data,cell in zip(csv_data,worksheet.range()): # Whatever range you need.
        cell.value=data # Sets the cell in *local* sheet to the specified data. (String)
    

    So this will change all of the local references for that cell to show your csv data. (It is worth pointing out that your data will be converted to a string when it is uploaded.

    If you want to read cell data as a number, I remember cell.numeric_value, showing up, though I see no reference in the docs.

    Then you would pass all of the now modified Cell items to update_cells() and then the google spread sheet would reflect your changes.

    You can see other reference material here: https://github.com/burnash/gspread#updating-cells

    Also, having had to tackle a very similar challenge in the past (json, not csv, but close enough), here is a nice helper function which will take a bunch of columns, and do the busy work of getting the cell object and then sending the update_cells() request. The code can be found at GitHub.com here.

    def update_columns(sheet, row, col, columns, execute = True):
        """Update the specified columns. Row and col are the starting most top left
           cell. Each column should be a list of values. Each list should be the
           same length.
        """
        # Step one, no columns is an error.
        if not columns:
            raise ValueError("Please specify at least one column to update.")
    
        # Otherwise, get that column length.
        r_len = len(columns[0])
        # First check that all columns are the same length.
        for column in columns[1:]:
            if len(column) != r_len:
                # Variable length.
                raise ValueError("Columns are of varying length.")
    
        # Start making lists.
        update_cells = []
    
        # Expand the sheet size if needed.
        if col + len(columns) > sheet.col_count:
            sheet.add_cols(col + len(columns) - sheet.col_count)
    
        if row + r_len > sheet.row_count:
           sheet.add_rows(row + r_len - sheet.row_count)
    
        # Get the range of cells to be updated.
        print("Range %s %s %s %s" % (row, col, row + r_len - 1 , col + len(columns) - 1))
        update_range = sheet.range (row, col, row + r_len - 1 , col + len(columns) - 1)
    
        for c, column in enumerate(columns):
    
            # Get the range on the sheet for the column.
    ##        column_range = sheet.range(row, col + c, row + len(column), col + c)
            column_range = (update_range[i] for i in range(c, len(update_range), len(columns)))
    
            for cell, value in zip(column_range, column):
                # Boolean rational.
                if isinstance(value, bool):
                    if str(value).upper() != cell.value:
                        # So its NOT the same.
                        cell.value = value
                        update_cells.append(cell)
    
                # Use numerical_value for numbers.
                elif isinstance(value, (int, float)):
                    # For whatever reason, it looks like gsheets
                    # truncates to the 10th place.
                    # It seems that 11th & 12th place is almost always correct but
                    # can actually differ slightly???
                    if cell.numeric_value is None or \
                       truncate(value, 10) != truncate(cell.numeric_value, 10):
                        cell.value = value
                        update_cells.append(cell)
    
                # And for everything else, string handling.
                elif isinstance(value, basestring):
                    if value != cell.value:
                        cell.value = value
                        update_cells.append(cell)
    
                # Handle None
                elif value is None:
                    if '' != cell.value:
                        # Set to ''
                        cell.value = ''
                        update_cells.append(cell)
    
                else:
                    # Other type, error.
                    raise ValueError("Cell value %r must be of type string, number, "
                                     "or boolean. Not %s." % (value, type(value)))
    
        # Now take the list of cells and call an update.
        if execute:
            print("Updating %d cells." % len(update_cells))
            if update_cells:
                sheet.update_cells(update_cells)
            return len(update_cells)
        else:
            return update_cells