Is it possible to create a table with a provided description string (for the table) using Apache Beam's WriteToBigQuery?
The additional_bq_parameters argument is useful for setting, for example, the clustering or partitioning fields, but I cannot find a way to set a table description there, nor in the schema object that is passed.
Is there any alternative way to do this (create a table with description / set table description) using the native Beam functions?
The apache_beam.io.gcp.bigquery.WriteToBigQuery transform does not directly support setting a table description at table creation time. The schema parameter lets you specify the table schema, but there is no parameter for a description. To set one, take the following steps:
Create the table independently: Use the BigQuery API or the bq command-line tool to create the BigQuery table before running your Beam pipeline. This lets you include a description when creating the table and guarantees that the table exists before the pipeline tries to write data. For more details, refer to this documentation.
Use WriteToBigQuery with CREATE_NEVER: In your Beam pipeline, pass beam.io.BigQueryDisposition.CREATE_NEVER as the create_disposition argument to WriteToBigQuery. Beam will then write data to the existing table rather than trying to create the table itself. For more details, refer to link1 and link2.
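The two steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes the google-cloud-bigquery client library is used for the table creation step, and the project, dataset, table name, description, and schema fields are all hypothetical placeholders. Pipeline options such as the runner and a GCS temp_location (typically needed for batch load jobs) are left out.

```python
# Hypothetical identifiers; replace with your own project, dataset, and table.
PROJECT = "my-project"
DATASET = "my_dataset"
TABLE = "events"
DESCRIPTION = "Events written by the Beam pipeline."

# Hypothetical schema as (name, type) pairs, used when creating the table.
FIELDS = [("user_id", "STRING"), ("score", "INTEGER")]


def client_table_id():
    """Table ID in the 'project.dataset.table' form used by the BigQuery client."""
    return f"{PROJECT}.{DATASET}.{TABLE}"


def beam_table_spec():
    """Table spec in the 'project:dataset.table' form accepted by WriteToBigQuery."""
    return f"{PROJECT}:{DATASET}.{TABLE}"


def create_table_with_description():
    """Step 1: create the table, with its description, before the pipeline runs."""
    from google.cloud import bigquery  # pip install google-cloud-bigquery

    client = bigquery.Client(project=PROJECT)
    schema = [bigquery.SchemaField(name, type_) for name, type_ in FIELDS]
    table = bigquery.Table(client_table_id(), schema=schema)
    table.description = DESCRIPTION
    # exists_ok=True avoids an error if the table already exists; it does not
    # update the description of an existing table.
    client.create_table(table, exists_ok=True)


def run_pipeline():
    """Step 2: write to the existing table without letting Beam create it."""
    import apache_beam as beam  # pip install "apache-beam[gcp]"

    # Pipeline options (runner, temp_location, ...) are omitted in this sketch.
    with beam.Pipeline() as p:
        (
            p
            | beam.Create([{"user_id": "a", "score": 1}])
            | beam.io.WriteToBigQuery(
                table=beam_table_spec(),
                create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

Call create_table_with_description() once, then run_pipeline(). Because the pipeline uses CREATE_NEVER, it fails if the table is missing rather than silently creating one without a description.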