google-bigquery google-cloud-storage orc

Load partitioned BigQuery table from partitioned ORC


I want to create a BigQuery table partitioned by the mydate column, loading it from partitioned ORC files.

Files in GCS:

mydate=2021-04-01/*.orc
...
mydate=2021-04-30/*.orc

bq command:

bq load --source_format=ORC --time_partitioning_field mydate --time_partitioning_type DAY mydataset.mytable gs://mydata/*.orc

When I run this command, I get this error: "The field specified for partitioning cannot be found in the schema". This happens because mydate is not a column in the ORC files; it only appears in the directory names.

How can I handle this?

Thanks for your help and have a nice day.


Solution

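  • I think you can do this by providing a custom partition key schema, encoded in the source_uri_prefix field. With the bq tool, that means setting --hive_partitioning_mode=CUSTOM together with --hive_partitioning_source_uri_prefix, so that BigQuery takes mydate from the mydate=... directory names instead of looking for it inside the ORC files.

    The links below describe the partition schema detection modes and include examples:

    [1] https://cloud.google.com/bigquery/docs/hive-partitioned-loads-gcs#command-line-tool
    [2] https://cloud.google.com/bigquery/docs/hive-partitioned-loads-gcs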
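As a rough sketch of what that could look like for the layout in the question: the bucket, dataset, table and column names are taken from the question, and it assumes the mydate=... directories sit directly under gs://mydata/. I have not run this against a live project, so treat the combination of flags as a starting point rather than a verified command. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: load Hive-partitioned ORC files so that mydate is read from
# the mydate=YYYY-MM-DD directory names rather than from the ORC files.
# Assumption: the partition directories live directly under this prefix.
PREFIX="gs://mydata"

# Build the bq invocation as positional parameters so every argument stays
# quoted (the * and the {...} must reach bq unchanged).
set -- bq load \
  --source_format=ORC \
  --hive_partitioning_mode=CUSTOM \
  "--hive_partitioning_source_uri_prefix=${PREFIX}/{mydate:DATE}" \
  --time_partitioning_field=mydate \
  --time_partitioning_type=DAY \
  mydataset.mytable \
  "${PREFIX}/*.orc"

# Show the command for review.
LOAD_CMD="$*"
printf '%s\n' "$LOAD_CMD"

# Execute it only when explicitly requested, e.g. RUN=1 sh load.sh
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

The {mydate:DATE} part of the prefix is what declares the partition key and its type; --time_partitioning_field can then refer to mydate because it becomes a column of the loaded table. If you would rather let BigQuery infer the key types, the same page also describes an AUTO mode, where the prefix is given without the {key:TYPE} part.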