I'm using the BigQuery API for Python to insert data in a table with table.insert_data()
.
Sometimes a row will contain a None
value for a field that is set to REPEATED
mode in the table schema. When this happens, the following error is returned by the API:
[{'index': 48,
'errors': [
{u'debugInfo': u'generic::invalid_argument: Field value cannot be empty.',
u'reason': u'invalid',
u'message': u'Field value cannot be empty.',
u'location': u'name_of_my_field'}]}]
I would like these rows to be ignored. This is the prototype of the function i am using, that can also be found here:
insert_data(rows, row_ids=None, skip_invalid_rows=None, ignore_unknown_values=None, template_suffix=None, client=None)
And this is the parameter I am interested in because it seems to perfectly fit my needs:
skip_invalid_rows (bool) – (Optional) skip rows w/ invalid data?
It mirrors this parameter in the main API:
skipInvalidRows boolean [Optional] Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist.
However by using the function with skip_invalid_rows=True
, the same errors are thrown.
Does this parameter mean what I think it means?
Worth pointing out:
skip_invalid_rows
in the doc?ignore_unknown_values=True
works fine for its own purpose.Any help much appreciated. :)
Maybe it's confusing but the error should still be thrown as a way to tell the end user that something didn't process as expected.
It does not mean though that correct rows are not being saved, they should be! If you check your table, you should see these rows being written normally there.
You can run a simple test like:
table.insert_data([('1', ['1', None, '2']), ('2', ['1', '2'])])
In this case, the value ('2', ['1', '2'])
should be saved normally.
I agree though on the docstrings
comments for this method not being that helpful. I myself was about to make a pull request adding some new features for the Table
resource but it's expected some major refactoring of this module in the near future so for now you might find these type of things there (and some other things that also might be confusing such as job
resource defined in the table
one).