
How can I export a BigQuery table with a specific row count per file?


I've exported a BigQuery table as CSV files into a bucket using this:

EXPORT DATA
OPTIONS (
  uri = 'gs://export/test/*.csv',
  format = 'csv',
  overwrite = true
)
AS (
  SELECT * FROM table
);

It is exported as several files in the bucket, which is nice because I use these files for some computational processes and it provides parallelism. But the files are still fairly large, and I want to specify a limit on the row count for each file. Is there any way to do that?

There is a solution for limiting file size in the GCP docs: https://cloud.google.com/bigquery/docs/exporting-data#control_the_exported_file_size, but it requires a temporary table to partition the process.


Solution

  • As @guillaume blaquiere mentioned in the comments:

    You can omit the temp table, but it will be more expensive. The trick here is to perform an EXPORT DATA statement for each "partition" (or chunk). Do it on a partitioned table, or do it on the fly. As you wish.

    Posting the answer as community wiki, as this is the best practice and for the benefit of the community that might encounter this use case in the future.

    Feel free to edit this answer for additional information.
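
The "one EXPORT DATA per chunk" idea above can be sketched with BigQuery scripting. This is only a sketch under assumptions: the table name `my_dataset.my_table`, the bucket path `gs://export/test/`, and the chunk size are all placeholders, and `EXECUTE IMMEDIATE` is used to build each export statement so the URI can vary per chunk. Note that the `ROW_NUMBER()` pass over the whole table is the extra cost the comment warns about.

```sql
-- Sketch: export a table in fixed-size row chunks, one EXPORT DATA per chunk.
-- Hypothetical names: my_dataset.my_table, gs://export/test/; adjust to taste.
DECLARE chunk_size INT64 DEFAULT 1000000;  -- max rows per chunk
DECLARE total_rows INT64;
DECLARE i INT64 DEFAULT 0;

SET total_rows = (SELECT COUNT(*) FROM my_dataset.my_table);

WHILE i * chunk_size < total_rows DO
  -- Build the statement dynamically so each chunk gets its own URI prefix.
  EXECUTE IMMEDIATE FORMAT("""
    EXPORT DATA
    OPTIONS (
      uri = 'gs://export/test/chunk_%d/*.csv',
      format = 'CSV',
      overwrite = true
    )
    AS (
      SELECT * EXCEPT (rn)
      FROM (
        -- ROW_NUMBER over the whole table: this is the "more expensive" part.
        SELECT *, ROW_NUMBER() OVER () AS rn
        FROM my_dataset.my_table
      )
      WHERE rn > %d AND rn <= %d
    )""", i, i * chunk_size, (i + 1) * chunk_size);
  SET i = i + 1;
END WHILE;
```

Each chunk can still be split into multiple files by the `*` wildcard, but no chunk contains more than `chunk_size` rows, so each output file is bounded too. With no ORDER BY, the row-to-chunk assignment is arbitrary (though complete and non-overlapping), which is usually fine for parallel processing.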