
External Backups/Snapshots for Google Cloud Spanner


Is it possible to snapshot a Google Cloud Spanner database or individual tables? For compliance reasons, we need daily snapshots of the current database that we can roll back to in the event of a disaster. Is this possible in Spanner? If not, is there any intention to support it?

For those who might ask why we would need this when Spanner is already replicated and redundant: replication doesn't guard against human error (dropping a table by accident) or sabotage/espionage, hence the question and the requirement.

Thanks, M


Solution

  • Today, you can stream out a consistent snapshot by reading all the data with your favorite tool (MapReduce, Spark, Dataflow), issuing every read at the same specific timestamp (using Timestamp Bounds).

    https://cloud.google.com/spanner/docs/timestamp-bounds

    You have about an hour to finish the export before the versions of the data at that timestamp get garbage collected.

    In the future, we will provide an Apache Beam/Dataflow connector to do this in a more scalable fashion. This will be our preferred method for importing and exporting data to and from Cloud Spanner.

    Longer term, we will support backups and the ability to restore from a backup, but that functionality is not currently available.
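The timestamp-bound export described above could be sketched roughly as follows with the Python client. This is only an illustration, not an official export tool: it assumes the `google-cloud-spanner` package and its `database.snapshot(read_timestamp=...)` context manager, and the helper names (`export_table_at`, `example`), the instance and database IDs, and the `Singers` table and its columns are all hypothetical placeholders. It streams a whole table through a single query, so it is only reasonable for data that can be read out well within the one-hour window.

```python
import csv
import datetime


def export_table_at(database, table, columns, read_timestamp, out):
    """Write every row of `table`, as of `read_timestamp`, to `out` as CSV.

    `database` is expected to behave like a google.cloud.spanner database
    handle (an assumption of this sketch). Returns the number of rows written.
    """
    writer = csv.writer(out)
    writer.writerow(columns)
    count = 0
    # Reads inside this block see the data as of read_timestamp,
    # regardless of writes committed after that time.
    with database.snapshot(read_timestamp=read_timestamp, multi_use=True) as snapshot:
        sql = "SELECT {} FROM {}".format(", ".join(columns), table)
        for row in snapshot.execute_sql(sql):
            writer.writerow(row)
            count += 1
    return count


def example():
    # Hypothetical usage; requires google-cloud-spanner and real resource IDs.
    from google.cloud import spanner

    client = spanner.Client()
    database = client.instance("my-instance").database("my-database")
    # Pin one timestamp and reuse it for every table so the export is consistent.
    ts = datetime.datetime.now(datetime.timezone.utc)
    with open("Singers.csv", "w", newline="") as out:
        export_table_at(database, "Singers", ["SingerId", "FirstName"], ts, out)
```

Reusing the same `read_timestamp` for every table is what makes the separate per-table exports add up to one consistent snapshot; the whole run still has to finish before that timestamp ages out.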