Search code examples
pythonpandasgoogle-sheetsgoogle-sheets-apigspread

Appending data to a Google Sheet using Python


I have 3 different tables I'm looking to directly push to 3 separate tabs in a Google Sheet. I set up the GSpread connection and that's working well. I started to adjust my first print statement into what I thought would append the information to Tab A (waveData), but no luck.

I'm looking to append the information to the FIRST blank row in a tab. Basically, so that the data will be ADDED to what is already in there.

I'm trying to use append_rows to do this, but am hitting a "gspread.exceptions.APIError: {'code': 400, 'message': 'Invalid value at 'data.values' (type.googleapis.com/google.protobuf.ListValue).

I'm really new to this, just thought it would be a fun project to evaluate wave sizes in NJ across all major surf spots, but really in over my head (no pun intended).

Any thoughts?

import requests
import pandas as pd
import gspread
gc = gspread.service_account(filename='creds.json')
sh = gc.open_by_key('152qSpr-4nK9V5uHOiYOWTWUx4ojjVNZMdSmFYov-n50')
waveData = sh.get_worksheet(0)
tideData = sh.get_worksheet(1)
lightData = sh.get_worksheet(2)
# AddValue = ["Test", 25, "Test2"]
# lightData.insert_row(AddValue, 3)


id_list = [
    '/Belmar-Surf-Report/3683/',
    '/Manasquan-Surf-Report/386/',
    '/Ocean-Grove-Surf-Report/7945/',
    '/Asbury-Park-Surf-Report/857/',
    '/Avon-Surf-Report/4050/',
    '/Bay-Head-Surf-Report/4951/',
    '/Belmar-Surf-Report/3683/',
    '/Boardwalk-Surf-Report/9183/',

]

for x in id_list:
    waveData.append_rows(pd.read_html(requests.get('http://magicseaweed.com' + x).text)
                         [2].iloc[:9, [0, 1, 2, 3, 4, 6, 7, 12, 15]].to_json(), value_input_option="USER_ENTERED")
    # print(pd.read_html(requests.get('http://magicseaweed.com' + x).text)[0])
    # print(pd.read_html(requests.get('http://magicseaweed.com' + x).text)[1])

Solution

  • From your following reply,

    there really is no relationship between the 3. When I scrape with IMPORTHTML into Google sheets, those are just Tables at the locations 0,1, and 2. I'm basically just trying to have an output of each table on a separate tab

    I understood that you wanted to retrieve the values with pd.read_html(requests.get('http://magicseaweed.com' + x).text)[2].iloc[:9, [0, 1, 2, 3, 4, 6, 7, 12, 15]] from id_list, and wanted to put the values to a sheet in Google Spreadsheet.

    In this case, how about the following modification?

    At append_rows, it seems that JSON data cannot be directly used. In this case, it is required to use a 2-dimensional array. And, I'm worried about the value of NaN in the datafarame. When these points are reflected in your script, how about the following modification?

    Modified script 1:

    In this sample, all values are put into a sheet.

    gc = gspread.service_account(filename='creds.json')
    sh = gc.open_by_key('152qSpr-4nK9V5uHOiYOWTWUx4ojjVNZMdSmFYov-n50')
    waveData = sh.get_worksheet(0)
    
    id_list = [
        "/Belmar-Surf-Report/3683/",
        "/Manasquan-Surf-Report/386/",
        "/Ocean-Grove-Surf-Report/7945/",
        "/Asbury-Park-Surf-Report/857/",
        "/Avon-Surf-Report/4050/",
        "/Bay-Head-Surf-Report/4951/",
        "/Belmar-Surf-Report/3683/",
        "/Boardwalk-Surf-Report/9183/",
    ]
    
    # I modified the below script.
    res = []
    for x in id_list:
        df = pd.read_html(requests.get("http://magicseaweed.com" + x).text)[2].iloc[:9, [0, 1, 2, 3, 4, 6, 7, 12, 15]].fillna("")
        values = [[x], df.columns.values.tolist(), *df.values.tolist()]
        res.extend(values)
        res.append([])
    
    waveData.append_rows(res, value_input_option="USER_ENTERED")
    
    • When this script is run, the retrieved values are put into the 1st sheet as follows. In this sample modification, the path and a blank row are inserted between each data. Please modify this for your actual situation.

    Modified script 2:

    In this sample, each value is put into each sheet.

    gc = gspread.service_account(filename='creds.json')
    sh = gc.open_by_key('152qSpr-4nK9V5uHOiYOWTWUx4ojjVNZMdSmFYov-n50')
    
    id_list = [
        "/Belmar-Surf-Report/3683/",
        "/Manasquan-Surf-Report/386/",
        "/Ocean-Grove-Surf-Report/7945/",
        "/Asbury-Park-Surf-Report/857/",
        "/Avon-Surf-Report/4050/",
        "/Bay-Head-Surf-Report/4951/",
        "/Belmar-Surf-Report/3683/",
        "/Boardwalk-Surf-Report/9183/",
    ]
    
    obj = {e.title: e for e in sh.worksheets()}
    for e in id_list:
        if e not in obj:
            obj[e] = sh.add_worksheet(title=e, rows="1000", cols="26")
    
    for x in id_list:
        df = pd.read_html(requests.get("http://magicseaweed.com" + x).text)[2].iloc[:9, [0, 1, 2, 3, 4, 6, 7, 12, 15]].fillna("")
        values = [df.columns.values.tolist(), *df.values.tolist()]
        obj[x].append_rows(values, value_input_option="USER_ENTERED")
    
    • When this script is run, the sheets are checked and created with the sheet names of the values in id_list, and each value is put to each sheet.

    Reference: