Search code examples
databasedatabricksrollback

How can I rollback entire databricks catalog/database to a point in time


The scenario is: I've run some defective transformation code over numerous tables in a databricks catalog. I need to restore the whole catalog/database back to a point in time instead of restoring each table individually. It this functionality available in databricks?

I know you can do this easily in Snowflake but does databricks have same functionality?


Solution

  • Databricks saves versions at the table level, and can only restore tables, not the entire database.

    You can run the script below to restore all tables one by one:

    restore_timestamp = "2024-06-30 00:00:00"
    db_name = "dwh"
    tables = spark.sql(f"show tables in {db_name}").collect()
    for t in tables:
        table_name = t["tableName"]
        print(table_name)
        restore_sql = f"RESTORE TABLE {db_name}.{table_name} TO TIMESTAMP AS OF '{restore_timestamp}';"
        spark.sql(restore_sql)