Tags: unicode, google-bigquery, google-cloud-storage, google-cloud-platform, byte-order-mark

BigQuery - create table via UI from cloud storage results in integer error


I am trying out BigQuery but am stuck on creating a table from data stored in Google Cloud Storage. I have reduced the data to a single value, and the failure still does not make sense.

I uploaded a text file to Google Cloud Storage that contains just one integer value: 177790884

I am creating the table through the wizard in the BigQuery web UI. In the schema definition section, I enter ID:INTEGER

The load always fails with:

Errors:
File: 0 / Line:1 / Field:1: Invalid argument: 177790884 (error code: invalid)
Too many errors encountered. Limit is: 0. (error code: invalid)

Job ID trusty-hangar-120519:job_LREZ5lA8QNdGoG2usU4Q1jeMvvU
Start Time Jan 30, 2016, 12:43:31 AM
End Time Jan 30, 2016, 12:43:34 AM
Destination Table trusty-hangar-120519:.onevalue
Source Format CSV
Allow Jagged Rows true
Ignore Unknown Values true
Source URI gs:///onevalue.txt
Schema
ID: INTEGER

If I load with a schema of ID:STRING, it works fine. The number 177790884 fits easily in a 64-bit signed integer, so I am really unsure what is going on. Thanks, Craig


Solution

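  • Your input file likely contains a UTF-8 byte order mark: 3 "invisible" bytes (EF BB BF) at the beginning of the file that indicate the encoding. They can cause BigQuery's CSV parser to fail, because the first field is then no longer made up of digits only. That would also be consistent with the STRING schema loading fine, since a string column accepts the extra bytes as part of the value.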

    https://en.wikipedia.org/wiki/Byte_order_mark

    I'd suggest Googling for a platform-specific method to view and remove the byte order mark. (A hex editor would do.)
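
If no hex editor is at hand, the check and the fix can also be scripted. The following is a minimal Python sketch, not part of the original answer. It assumes the file is small enough to read into memory, and the helper names (has_utf8_bom, strip_utf8_bom, clean_file) are made up for illustration. It only uses the standard library constant codecs.BOM_UTF8, which holds the three bytes EF BB BF.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


def clean_file(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading BOM.

    Returns True if a BOM was found and removed. The paths are whatever
    local files you choose (hypothetical names, not from the question).
    """
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_utf8_bom(data))
    return has_utf8_bom(data)
```

For example, calling clean_file("onevalue.txt", "onevalue_nobom.txt") on a local copy of the file and re-uploading the cleaned version to Cloud Storage should let the INTEGER schema load, provided the byte order mark really was the cause.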