Tags: json, google-bigquery, javascript-objects, google-cloud-dataflow, user-defined-functions

Failure to serialize json to table row during Dataflow Job to stream data to BigQuery


I'm using a Dataflow job template to stream data from a Pub/Sub subscription to BigQuery. For each JSON message, I need to transform the values and output multiple table rows at once to a BigQuery table. A simplified version of the JSON message arriving at Pub/Sub is as follows:

{"a":{"k1":v1, "k2":v2}, "b":{"k1":v1, "k2":v2}...}

The transformed JSON should instead look like this:

[{"k1":v1, "k2":v2}, {"k1":v1, "k2":v2}...]

This is a simplification of the UDF I've created:

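function transformToTableRows(inJson) {
  var input = JSON.parse(inJson);
  var output = [];
  // Build one {k1, k2} object per top-level key of the message.
  for (var elem in input) {
    output.push({"k1": input[elem].k1, "k2": input[elem].k2});
  }
  // Returns a JSON array, i.e. several rows for a single message.
  return JSON.stringify(output);
}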

Unfortunately, this doesn't work and logs the error "Failed to serialize json to table row". Any suggestions on how I can fix this?


Solution

  • As per the documentation, the template is meant to output only a single table row per message, so a UDF that returns a JSON array of rows cannot be serialized to a table row. Thanks – 大ドア東
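
One possible way to stay within the one-row-per-message limit is to return a single JSON object and nest the per-key values inside it. The sketch below is only an illustration under assumptions that are not in the original post: the function name transformToSingleRow and the field name "items" are hypothetical, and the BigQuery table would need a REPEATED RECORD column called "items" with sub-fields k1 and k2. If one flat row per element is required, this approach does not give you that.

```javascript
/**
 * Hypothetical variant of the UDF that returns ONE JSON object per message.
 * Assumption: the destination table has a REPEATED RECORD column named
 * "items" with sub-fields k1 and k2 (this name is not from the original post).
 */
function transformToSingleRow(inJson) {
  var input = JSON.parse(inJson);
  var items = [];
  for (var key in input) {
    // Skip inherited properties; keep only the message's own keys.
    if (Object.prototype.hasOwnProperty.call(input, key)) {
      items.push({ k1: input[key].k1, k2: input[key].k2 });
    }
  }
  // The top level is a single object (not an array), so it maps to one row.
  return JSON.stringify({ items: items });
}

// Usage example with made-up values.
var sample = '{"a":{"k1":1,"k2":2},"b":{"k1":3,"k2":4}}';
console.log(transformToSingleRow(sample));
```

The code keeps the `var`-based style of the original UDF. The nested values can later be flattened into separate rows in BigQuery with a query that uses UNNEST on the repeated column.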