google-cloud-platform · google-bigquery · dataset · cost-management · backup-strategies

How can I back up BigQuery data while minimizing financial cost?


I'm helping to wind down an organization that is going out of business. One of the tasks is to make a backup of all our datasets in BigQuery for at least 5 years, for legal purposes. Since this is quite a long time, I would like to minimize the cost as much as possible.

Since we'll close the company's Google Workspace, we created a new backup account to transfer all the data to. We tried the following:

  • Export the datasets to Google Cloud Storage. This option is only available within the same account; I couldn't find a way to migrate the exports to another account.
  • Transfer the project to another account. This seems like the easiest option, but it forces us to create a new paid subscription in the backup account, which makes it the most expensive one.
  • Export all datasets to local files and upload them to another kind of storage. This is the most work-intensive option, since there's no built-in way to export the datasets in batch (see the sketch after this list). I think I could use Dataflow for the migration, but couldn't find any example of how to do this.
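
For reference, the batch export can be scripted with the BigQuery Python client. Below is a minimal sketch, assuming a placeholder project ID and a destination bucket named my-backup-bucket in a compatible location (both names are illustrative, and compressed Avro is just one reasonable format choice):

```python
# Sketch: export every table of every dataset in a project to Cloud Storage.
# Assumes google-cloud-bigquery is installed and the caller is authenticated.
from google.cloud import bigquery

PROJECT = "my-project-id"    # placeholder: replace with the real project ID
BUCKET = "my-backup-bucket"  # placeholder: bucket in a location compatible with the datasets

client = bigquery.Client(project=PROJECT)

job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.AVRO,
    compression=bigquery.Compression.SNAPPY,  # compressed Avro keeps the files small
)

for dataset in client.list_datasets():
    for table in client.list_tables(dataset.dataset_id):
        source = f"{PROJECT}.{dataset.dataset_id}.{table.table_id}"
        # The * wildcard lets BigQuery shard large tables across several files.
        destination = f"gs://{BUCKET}/{dataset.dataset_id}/{table.table_id}-*.avro"
        client.extract_table(source, destination, job_config=job_config).result()
        print(f"Exported {source} -> {destination}")
```

Extract jobs themselves are free of charge; the ongoing cost is the Cloud Storage space the exported files occupy.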

I'd be glad to hear new ideas. Thanks!


Solution

  • As suggested by Guillaume, you can export all the tables to Cloud Storage. Then do what you want: set the storage class to Archive to minimize cost, or download the files, gzip them, and upload them wherever you want (Google Drive, a Cloud Storage Archive-class bucket, ...). In any case, Cloud Storage is the required point of passage for the next steps! (A sketch of switching a bucket to the Archive class follows below.)

    Posting the answer as community wiki for the benefit of anyone who might encounter this use case in the future. Feel free to edit this answer to add more information.
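
As a follow-up to the Archive suggestion above, here is a minimal sketch that switches an existing bucket's default storage class to Archive with the Cloud Storage Python client (my-backup-bucket is a placeholder name):

```python
# Sketch: set a bucket's default storage class to ARCHIVE, the cheapest class
# for data accessed less than once a year.
# Assumes google-cloud-storage is installed and the caller is authenticated.
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("my-backup-bucket")  # placeholder bucket name

bucket.storage_class = "ARCHIVE"
bucket.patch()  # persist the change on the bucket

print(f"Default storage class for {bucket.name}: {bucket.storage_class}")
```

Note that the default class only applies to objects written after the change; objects already in the bucket keep their current class until they are rewritten (for example, by a lifecycle rule that transitions them to Archive).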