Search code examples
pythonpandaspyodbcgoogle-sheets-apigoogle-api-python-client

How do I handle "Object of type 'Timestamp' is not JSON serializable" in Python / Pandas?


Preface: Extremely new at Python, but thankful for the SO help!

Below is a code snippet where I'm trying to execute a SQL query against a MSSQL server table, and post it back to Google Sheets. I am able to retrieve data and headers, and I think I almost have it figured out. However, I'm having some trouble with the datetime format that some of the columns have. The error I'm receiving is:

Traceback (most recent call last):
  File "modelhome.py", line 153, in <module>
    valueInputOption=value_input_option, insertDataOption=insert_data_option, body=value_range_body)
  File "C:\ProgramData\Anaconda3\lib\site-packages\googleapiclient\discovery.py", line 785, in method
    actual_path_params, actual_query_params, body_value)
  File "C:\ProgramData\Anaconda3\lib\site-packages\googleapiclient\model.py", line 151, in request
    body_value = self.serialize(body_value)
  File "C:\ProgramData\Anaconda3\lib\site-packages\googleapiclient\model.py", line 260, in serialize
    return json.dumps(body_value)
  File "C:\ProgramData\Anaconda3\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "C:\ProgramData\Anaconda3\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\ProgramData\Anaconda3\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "C:\ProgramData\Anaconda3\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'Timestamp' is not JSON serializable

Code Snippet

"""Execute SQL Statement, create table, and append back to Google Sheet"""
# SQL Server Connection
server = '[SQLServerIP]'
database = '[SQLServerDatabase]'
username = '[SQLServerUsername]'
password = '[SQLServerPassword]'
cnxn = pyodbc.connect('Driver={ODBC Driver 13 for SQL Server};SERVER=' +
                      server+';DATABASE='+database+';UID='+username+';PWD='+password)

# Sample SQL Query to get Data
sql = 'select * from tblName'
cursor = cnxn.cursor()
cursor.execute(sql)
list(cursor.fetchall())

# Pandas reading values from SQL query, and building table
sqlData = pandas.read_sql_query(sql, cnxn)

# Pandas building dataframe, and exporting .xlsx copy of table
df = DataFrame(data=sqlData)

df.to_excel('tblName.xlsx',
            header=True, index=False)
dfHeaders = df.columns.values.tolist()
dfHeadersArray = [dfHeaders]
dfData = df.values.tolist()
dfDataFormatted = [dfData]
"""Writing to Google Sheet Range"""
print(dfHeaders)
print(dfData)

# How the input data should be interpreted.
value_input_option = 'USER_ENTERED'  # TODO: Update placeholder value.

# How the input data should be inserted.
insert_data_option = 'OVERWRITE'  # TODO: Update placeholder value.

value_range_body = {
    "majorDimension": "ROWS",
    "values":
    dfHeadersArray + dfDataFormatted
}

request = service.spreadsheets().values().append(spreadsheetId=spreadsheetId, range=SQLRangeName,
                                                 valueInputOption=value_input_option, insertDataOption=insert_data_option, body=value_range_body)
response = request.execute()

Within dfData, normal strings appear like this: enter image description here

datettime entries appear like this: enter image description here

My understanding is that the JSON doesn't have a native way to handle this data type, and it has to be handled as an exception. Is there a way to serialize all Timestamp parts of the dataset without having to specify which columns are datetime?

Any help/advice you all could provide would be greatly appreciated.

Thanks!

Final Solution Update - Credit: @chrisheinze

Adding the following dataframe modeling for datettime headers worked perfectly.

# Pandas reading values from SQL query, and building table
sqlData = pandas.read_sql_query(sql, cnxn)

# Pandas building dataframe, and exporting .xlsx copy of table
df = DataFrame(data=sqlData)

# Google Sheets API can't handle date/time. Below converts certain headers to formatted text strings.
df['Date'] = df['Date'].dt.strftime('%m/%d/%Y')
df['DateTime'] = df['DateTime'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['RDD'] = df['RDD'].dt.strftime('%m/%d/%Y')
df['DateTimeErrorTable'] = df['DateTimeErrorTable'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['DateTimeSuccessTable'] = df['DateTimeSuccessTable'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['WorkedOn'] = df['WorkedOn'].dt.strftime('%m/%d/%Y %H:%M:%S')
df['EmailSentOn'] = df['EmailSentOn'].dt.strftime('%m/%d/%Y %H:%M:%S')

Hope it helps someone else!


Solution

  • The Sheets API doesn't know what to do with a Python datetime/timestamp. You'll need to convert it - most likely to a str.

    For converting a pandas Series use pd.Series.dt.strftime()

    If it's just for a single value that needs to be converted then use datetime's strftime()

    Edit to answer your question in the comments:

    # To convert a datetime column to a str. 
    
    df['date_column'] = df['date_column'].dt.strftime('%Y%m%d%H%M%S')
    

    To give a bit more info, strftime means "string format datetime". This allows you to format your datetime/timestamp value into a str. The '%Y%m%d%H%M%S' is what you want the output the be. In my example, the results would be "20180309152303" for your date. Another example would be '%m/%d/%Y %H:%M:%S' which would give you "03/09/2018 15:23:03". So replace 'date_column' in my example with the name of your date column and it'll be converted to a str that's compatible with the API as well as understood format-wise in Google Sheets.