Search code examples
google-bigqueryscheduled-tasksbackupsnapshotservice-accounts

BigQuery Scheduled Snapshots Work Only on Initial Run. Why?


I have recently implemented Scheduled Snapshots as suggested by Googles Guide.

The initial run went smooth and well, however further executions fail with the following error message:

Access Denied: Table PROJECT:BACKUP.TABLE_20230315: Permission bigquery.tables.deleteSnapshot denied on table PROJECT:BACKUP.TABLE_20230315 (or it may not exist). at [1:1]; JobID: 000000000:scheduled_query_000000000000-000-etc

I googled the error message, as the appropriate permissions should be set, but without success.

The table, as the error message suggests does not exist yet, obviously, as the schedule schould create it (that's what the schedule is for, right?).

I must admit that I have never worked with Snapshots before so I apologise if the question is too nooby.

Essentially I followed the Guide which seemed to suggest that this approach is correct.

For all I know I implemented it marticulously with the only change that I used locaton EU instead of US and I changed the schedule to "every 24 hours" instead of "1 of month 05:00"

Just to recap: I created a service account with Data Editor Rights on my Backup Data Set and BigQuery User for the whole project.

Then I used the Command Line Command described in the guide:

bq query --use_legacy_sql=false --display_name="Daily snapshots of the TABLE table" \
--location="EU" --schedule="every 24 hours" \
--project_id=PROJECT \
'DECLARE snapshot_name STRING;
DECLARE expiration TIMESTAMP;
DECLARE query STRING;
SET expiration = DATE_ADD(@run_time, INTERVAL 40 DAY);
SET snapshot_name = CONCAT("BACKUP.TABLE_",
  FORMAT_DATETIME("%Y%m%d", @run_date));
SET query = CONCAT("CREATE SNAPSHOT TABLE ", snapshot_name,
  " CLONE PROJECT.DATASET.TABLE OPTIONS(expiration_timestamp=TIMESTAMP \"",
  expiration, "\");");
EXECUTE IMMEDIATE query;'

And deployed the query as described in the guide:

bq update --transfer_config --update_credentials \
--service_account_name=snapshot-bot@PROJECT.iam.gserviceaccount.com \
projects/12345/locations/eu/myConfig/12345

I expected the query to keep on running after the initial run, as this is the point I see in scheduling it.


Solution

  • I have to admit, I still don't understand why it doesn't work with the method described in the documentation. I can only say that once I wrote the query and scheduled it via the console it worked...

    The SQL Query looked something like this:

    DECLARE snapshot_name STRING;
    DECLARE query STRING;
    SET snapshot_name = CONCAT("BACKUP.TABLE_",
      FORMAT_DATETIME("%Y%m%d", @run_date));
    SET query = CONCAT("CREATE SNAPSHOT TABLE ", snapshot_name,
      " CLONE PROJECT-ID.DATA-SET.TABLE;");
    EXECUTE IMMEDIATE query;