I want to periodically transfer data from Google Datastore to BigQuery in an automated way. On the first run it should transfer everything, and from then on it should keep Datastore and BigQuery in sync.
So far I have found one script: https://github.com/chees/datastore2bigquery/blob/master/datastore2bigquery.sh. Can somebody help with the cost calculation, assuming the data in Datastore is 100 MB? Is this a cost-effective method, or does a better one exist?
Can I use Dataflow or other Google infrastructure instead of maintaining a cron job or Jenkins CI myself?
As @guillaume blaquiere mentioned in the comments:
A script is the best way (export Datastore to Storage and then load from Storage into BigQuery), but it's async only; you don't have a realtime replication option.
About cost, you pay only for the storage: Datastore export and BigQuery load jobs are free.
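A minimal sketch of that export-then-load flow is below. The bucket, kind, and dataset.table names are placeholders I made up (not from the question or the linked script), and the exact path of the generated .export_metadata file depends on the export's output layout, so verify it before wiring the script into a scheduler.

```bash
#!/usr/bin/env bash
# Hedged sketch of "export Datastore to Storage, then load into BigQuery".
# Bucket, kind, and table names are placeholders.
set -euo pipefail

BUCKET="gs://my-datastore-exports"          # placeholder bucket
KIND="Task"                                 # placeholder Datastore kind
TABLE="datastore_sync.task"                 # placeholder BigQuery dataset.table
PREFIX="${BUCKET}/$(date +%Y%m%dT%H%M%S)"   # one folder per run

# 1. Export the kind to Cloud Storage. The export job itself is free;
#    you pay only for the Storage the export files occupy.
gcloud datastore export --kinds="${KIND}" "${PREFIX}"

# 2. Load the export into BigQuery. --replace overwrites the table, so each
#    run leaves BigQuery matching the latest Datastore snapshot (load jobs
#    are free). Check the metadata file path against the actual export output.
bq load --replace --source_format=DATASTORE_BACKUP "${TABLE}" \
  "${PREFIX}/all_namespaces/kind_${KIND}/all_namespaces_kind_${KIND}.export_metadata"
```

You can run something like this from cron or Jenkins, or trigger it from Cloud Scheduler via a small Cloud Function or Cloud Run job; either way each run produces a fresh snapshot rather than realtime replication.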
Posting the answer as community wiki for the benefit of the community that might encounter this use case in the future.
Feel free to edit this answer for additional information.