python pandas google-sheets google-sheets-api gspread

Problem with data format when importing a pandas DataFrame from Python into Google Sheets using df2gspread


I'm using df2gspread to upload a pandas DataFrame to Google Sheets. The upload runs without errors, but the numeric data that I'd like to manipulate in Sheets arrives as text. Basic arithmetic on those cells works, but Sheets functions such as SUM and AVERAGE (and pretty much anything else) always return zero. Manually converting the text to numbers inside the sheet has no effect either.

The code is as follows:

import pandas as pd
import gspread as gs
from df2gspread import df2gspread as d2g
from oauth2client.service_account import ServiceAccountCredentials

# tera: database client object, defined elsewhere (not shown)
result = tera.execute_response("select * from table_drive")
df = pd.DataFrame(result)

scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    'json_gsheets.json', scope)
gc = gs.authorize(credentials)

spreadsheet_key = 'insert_wks_key_here'
wks = 'import'
d2g.upload(df, spreadsheet_key, wks, credentials=credentials,
           row_names=False, start_cell='B3')

This inserts the data correctly, but everything is in there irrevocably as text.

Can anyone help?

Thanks in advance!


Solution

  • How about this answer?

    Issue

    Looking at the df2gspread source, the upload function appears to call gspread's update_cells(). In gspread, the default "valueInputOption" for that method is RAW, and df2gspread uses this default. As a result, numeric values are written as text, with a single quote ' prepended. I think this is the cause of your issue.
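To make the role of "valueInputOption" more concrete, here is a minimal sketch of what a spreadsheets.values.update request to the Sheets API v4 looks like. `build_values_update` is a hypothetical helper written only for illustration (it is not part of gspread or df2gspread), and it builds the URL and JSON body without sending anything. The sample rows and the "import!B3" range are assumptions chosen to match the question.

```python
import json
from urllib.parse import quote, urlencode

SHEETS_ENDPOINT = "https://sheets.googleapis.com/v4/spreadsheets"


def build_values_update(spreadsheet_id, range_name, rows,
                        value_input_option="USER_ENTERED"):
    """Return (url, body) for a values.update call.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    if value_input_option not in ("RAW", "USER_ENTERED"):
        raise ValueError("value_input_option must be 'RAW' or 'USER_ENTERED'")
    query = urlencode({"valueInputOption": value_input_option})
    url = (f"{SHEETS_ENDPOINT}/{spreadsheet_id}/values/"
           f"{quote(range_name, safe='')}?{query}")
    body = json.dumps({"values": rows})
    return url, body


# Numbers that reach the API as strings (illustrative data).
rows = [["price", "qty"], ["19.90", "3"]]

# RAW: strings are stored exactly as sent, so "19.90" stays text.
raw_url, _ = build_values_update("insert_wks_key_here", "import!B3", rows, "RAW")

# USER_ENTERED: Sheets parses each value as if it had been typed into the
# cell, so "19.90" becomes a number.
parsed_url, parsed_body = build_values_update(
    "insert_wks_key_here", "import!B3", rows)
```

The only difference between the two requests is the query parameter, which is why both patterns in this answer come down to passing USER_ENTERED.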

    To get the values into the sheet as numbers, I would like to propose the following two patterns.

    Pattern 1:

    In this pattern, the df2gspread script itself is modified. Please modify the upload function as follows. In the current version, I think there are three places where this call appears.

    From:

    wks.update_cells(cell_list)
    

    To:

    wks.update_cells(cell_list, value_input_option='USER_ENTERED')
    

    Pattern 2:

    In this pattern, gspread's "values_update" method is used instead of df2gspread.

    Sample script:

    import pandas as pd
    import gspread as gs
    from oauth2client.service_account import ServiceAccountCredentials
    
    result = tera.execute_response("select * from table_drive")
    df = pd.DataFrame(result)
    
    scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
    credentials = ServiceAccountCredentials.from_json_keyfile_name('json_gsheets.json', scope)
    
    gc = gs.authorize(credentials)
    spreadsheet_key = 'insert_wks_key_here'
    wks = 'import'
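    spreadsheet = gc.open_by_key(spreadsheet_key)
    # Header row first, then the data rows.
    values = [df.columns.values.tolist()]
    values.extend(df.values.tolist())
    # USER_ENTERED makes Sheets parse the values as if they were typed in.
    spreadsheet.values_update(wks, params={'valueInputOption': 'USER_ENTERED'}, body={'values': values})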
    
    • You can see that USER_ENTERED is also used in this case.
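Two details may need attention when adapting Pattern 2 to the question. First, the original call used start_cell='B3', whereas passing only the sheet name as the range writes from A1; a range in A1 notation such as 'import'!B3 should anchor the write at B3 instead. Second, df.values.tolist() can contain NaN, which is not valid JSON, so the request is likely to fail if the DataFrame has missing values. The sketch below shows one way to handle both in plain Python. `to_sheet_rows` and `anchored_range` are hypothetical helper names, not part of gspread or pandas.

```python
import math


def to_sheet_rows(columns, records, empty=""):
    """Build a header row plus data rows, replacing None/NaN with `empty`."""
    def clean(value):
        if value is None:
            return empty
        if isinstance(value, float) and math.isnan(value):
            return empty
        return value

    rows = [list(columns)]
    rows.extend([clean(v) for v in record] for record in records)
    return rows


def anchored_range(sheet_name, start_cell):
    """Return a quoted A1-notation range, e.g. 'import'!B3.

    Assumes a single quote inside a sheet name is escaped by doubling it.
    """
    escaped = sheet_name.replace("'", "''")
    return f"'{escaped}'!{start_cell}"


# Illustrative data; with a DataFrame this would be
# to_sheet_rows(df.columns, df.values.tolist()).
values = to_sheet_rows(["name", "price"], [["a", 1.5], ["b", float("nan")]])
target = anchored_range("import", "B3")

# With gspread, these would then be passed as:
# spreadsheet.values_update(target,
#                           params={'valueInputOption': 'USER_ENTERED'},
#                           body={'values': values})
```

Replacing missing values with an empty string leaves those cells blank in the sheet; a different placeholder can be passed through the `empty` argument if blanks are not wanted.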
