python google-bigquery google-cloud-datalab

How to debug parse error when inserting data to BigQuery from Google Cloud Datalab?


How can I debug a failure to insert data into BigQuery from Google Cloud Datalab?

This is my code; it throws an error on the last line. aggregate_data is a pandas DataFrame with 8172 rows and 92 columns:

ds = 'calculations'
dataset = bq.DataSet(ds)
dataset.create()
schema = bq.Schema.from_dataframe(aggregate_data)
table_name = 'cost_ratios'
temptable = bq.Table(ds + '.' + table_name).create(schema=schema, 
                                                   overwrite=True)
temptable.insert_data(aggregate_data)

This is the error that is thrown:

RequestException                          Traceback (most recent call last)
<ipython-input-6-b905b654683e> in <module>()
     49 temptable = bq.Table(ds + '.' + table_name).create(schema=schema, 
     50                                                    overwrite=True)
---> 51 temptable.insert_data(aggregate_data)

/usr/local/lib/python2.7/dist-packages/gcp/bigquery/_table.pyc in insert_data(self, data, include_index, index_name)
    364           response = self._api.tabledata_insertAll(self._name_parts, rows)
    365         except Exception as e:
--> 366           raise e
    367         if 'insertErrors' in response:
    368           raise Exception('insertAll failed: %s' % response['insertErrors'])

RequestException: Parse Error

When I look in BigQuery, the table has been created with the correct schema, but it contains no data.

How can I debug this further? The error above doesn't tell me much and I can't see anything in BigQuery.


Solution

  • My guess is that the DataFrame contains data that does not conform to the schema. The error comes from BigQuery, and I believe it occurs because BigQuery tries to parse a field according to the type specified in the schema and fails.

    Try catching that exception and printing its 'content' property; that gives you the full response from BigQuery and may shed more light on the problem.
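The suggestion above could be sketched roughly as follows. This is a minimal illustration, not the library's own API: describe_request_error and insert_with_diagnostics are hypothetical helper names, and the code assumes, as the answer suggests, that the raised exception may carry a 'content' attribute holding the raw response body. If the attribute is missing, it falls back to the ordinary exception message.

```python
def describe_request_error(exc):
    """Return the most detailed message available for a failed request.

    Assumption: the exception may expose a `content` attribute with the
    raw response body. If it does not, fall back to str(exc).
    """
    content = getattr(exc, 'content', None)
    if content is None:
        return str(exc)
    if isinstance(content, bytes):
        # Response bodies are often raw bytes; decode them for printing.
        content = content.decode('utf-8', 'replace')
    return content


def insert_with_diagnostics(table, dataframe):
    """Call table.insert_data(dataframe); on failure, print detail and re-raise."""
    try:
        return table.insert_data(dataframe)
    except Exception as exc:
        print('insert_data failed: %s' % describe_request_error(exc))
        raise  # keep the original traceback visible in the notebook
```

With the question's variables, the last line of the cell would become insert_with_diagnostics(temptable, aggregate_data). The exception is re-raised on purpose, so the notebook still shows the traceback alongside the printed response.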