Search code examples
pythonrubygoogle-bigquerygoogle-api-clientgoogle-api-ruby-client

BigQuery streaming insertAll appears to lose data - why?


Im trying to use the streaming insert_all method to insert data to a table using the google-api-client gem in ruby.

So I start with creating a new table in Bigquery (read and write priveleges are correct) with the following contents:

+-----+-----------+-------------+
| Row | person_id | person_name |
+-----+-----------+-------------+
|   1 |         1 | ABCD        |   
|   2 |         2 | EFGH        |   
|   3 |         3 | IJKL        |   
+-----+-----------+-------------+

This is my code in ruby: (I discovered earlier today that tabledata.insert_all is ruby for tabledata.insertAll - google docs / example need to be updated)

def streaming_insert_data_in_table(table, dataset=DATASET)
    body = {"rows"=>[ 
      {"json"=> {"person_id"=>10,"person_name"=>"george"}},
      {"json"=> {"person_id"=>11,"person_name"=>"washington"}}
    ]}

    result = @client.execute(
      :api_method=> @bigquery.tabledata.insert_all,
      :parameters=> {
          :projectId=> @project_id.to_s,
          :datasetId=> dataset,
          :tableId=>table},
      :body_object=>body,
    )
    puts result.body
end

So I run my code the first time and all appears fine. I see this in the table on Bigquery:

 +-----+-----------+-------------+
 | Row | person_id | person_name |
 +-----+-----------+-------------+
 |   1 |         1 | ABCD        |
 |   2 |         2 | EFGH        |
 |   3 |         3 | IJKL        |
 |   4 |        10 | george      |
 |   5 |        11 | washington  |
 +-----+-----------+-------------+

Then I change the data in the method to:

body = {"rows"=>[ 
      {"json"=> {"person_id"=>5,"person_name"=>"john"}},
      {"json"=> {"person_id"=>6,"person_name"=>"kennedy"}}
    ]}

Run the method and get this in Bigquery:

 +-----+-----------+-------------+
 | Row | person_id | person_name |
 +-----+-----------+-------------+
 |   1 |         1 | ABCD        |
 |   2 |         2 | EFGH        |
 |   3 |         3 | IJKL        |
 |   4 |        10 | george      |
 |   5 |         6 | kennedy     |
 +-----+-----------+-------------+

So, what gives? I've lost data.... (ids 11 and id 5 have vanished) The responses for the request do not have errors either.

Could someone tell me if Im doing something incorrectly or why this is happening please?

Any help is much appreciated.

Thanks and have a great day.


Solution

  • Discovered this appears something to do with the ui (row count doesn't populate for a while and trying to extract the data in the table results in an error "Unexpected. Please try again."). However data is actually stored and can be queried. Thanks for the help Jordan