I have created a Google Cloud Run service that performs a BigQuery ETL operation in response to a BigQuery event being written to the audit log. My service is written as a Python Flask app and it follows the principles given in How to trigger Cloud Run actions on BigQuery events. More specifically, the service is triggered by Eventarc when Google Analytics data are imported into BigQuery.
I can test this locally by starting the app in a Docker container and sending the service a POST request that contains JSON from an appropriate audit log entry. It works as expected: the ETL operation is performed and no errors are returned.
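The local test can be scripted roughly as follows. This is a minimal sketch using only the Python standard library. The URL `http://localhost:8080/`, the helper names `build_request` and `send`, and the stub payload are assumptions for illustration; a real test would post a complete audit log entry copied from Cloud Logging.

```python
import json
import urllib.request

# Assumption: the Docker container publishes the service on localhost:8080.
LOCAL_URL = "http://localhost:8080/"

# Hypothetical stub, not a complete audit log entry. The values mirror the
# Eventarc trigger filters; replace it with a real entry for a real test.
SAMPLE_ENTRY = {
    "protoPayload": {
        "serviceName": "bigquery.googleapis.com",
        "methodName": "google.cloud.bigquery.v2.JobService.InsertJob",
        "resourceName": "projects/XXX/jobs/XXX",
    }
}


def build_request(entry: dict, url: str = LOCAL_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the audit log entry."""
    body = json.dumps(entry).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(entry: dict, url: str = LOCAL_URL, timeout_s: float = 300) -> int:
    """POST the entry and return the HTTP status (container must be running)."""
    with urllib.request.urlopen(build_request(entry, url), timeout=timeout_s) as resp:
        return resp.status
```

Calling `send(SAMPLE_ENTRY)` while the container is running should return the status code of the service's response.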
The app deploys to Google Cloud without issue. Eventarc correctly triggers the service when the Google Analytics import is complete. The service runs as expected, correctly performing the ETL operation and returning a 200 OK response. But then the service is repeatedly invoked with the same event. This loop only stops when the next Eventarc trigger is activated.
The ETL operation takes approximately 60 seconds. If I replace the ETL operation with a time.sleep(60) statement, the same problem occurs, as it does at 10 seconds too. However, if I remove the ETL operation and sleep altogether, the retry loop stops.
Finally, the Metrics Explorer shows a series of webhook_timeout responses for "Cloud Pub/Sub Subscription - Push Requests".
All of this suggests to me that "the system" is retrying the event because it is taking too long. But why? And how do I fix it?
$ gcloud run services describe XXX-svc
✔ Service XXX-svc in region XXX

URL:     https://XXX
Ingress: internal
Traffic:
  100% LATEST (currently XXX)

Last updated on 2022-08-04T08:27:05.918172Z by XXX:
  Revision XXX
  Image:           XXX
  Port:            8080
  Memory:          512Mi
  CPU:             1000m
  Service account: XXX
  Concurrency:     80
  Min Instances:   1
  Max Instances:   1
  Timeout:         300s
$ gcloud --project="${PROJECT}" eventarc triggers describe XXX-trigger --location=XXX
createTime: '2022-08-04T06:59:33.232085395Z'
destination:
  cloudRun:
    region: XXX
    service: XXX-svc
eventFilters:
- attribute: resourceName
  operator: match-path-pattern
  value: projects/XXX/jobs/*
- attribute: type
  value: google.cloud.audit.log.v1.written
- attribute: serviceName
  value: bigquery.googleapis.com
- attribute: methodName
  value: google.cloud.bigquery.v2.JobService.InsertJob
name: projects/XXX/locations/XXX/triggers/XXX-trigger
serviceAccount: XXX
transport:
  pubsub:
    subscription: projects/XXX/subscriptions/eventarc-XXX-XXX-trigger-sub-724
    topic: projects/XXX/topics/eventarc-XXX-XXX-trigger-724
uid: XXX
updateTime: '2022-08-04T10:15:33.683873843Z'
Thanks to the accepted answer from @guillaume blaquiere and the comment from @Pentium10, I was able to update the Pub/Sub subscription acknowledgement deadline:
# List Eventarc trigger names.
gcloud \
  --project="${PROJECT}" \
  eventarc triggers list \
  --format='value(name)'

TRIGGER="..."

# Get the Eventarc trigger Pub/Sub subscription name.
PUBSUB=$(gcloud \
  --project="${PROJECT}" \
  eventarc triggers describe "${TRIGGER}" \
  --format='value(transport.pubsub.subscription)')

# Describe the subscription.
gcloud \
  --format=json \
  pubsub subscriptions describe "${PUBSUB}"

# Update the acknowledgement deadline.
gcloud \
  pubsub subscriptions update "${PUBSUB}" \
  --ack-deadline=300
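For anyone who prefers to script this from Python, the same two key steps can be sketched as below. This only wraps the gcloud commands shown above; the function names are my own, and the bounds check assumes the 10 to 600 second range that Pub/Sub accepts for an acknowledgement deadline.

```python
from __future__ import annotations

import subprocess

MIN_ACK_DEADLINE_S = 10   # lowest ack deadline Pub/Sub accepts
MAX_ACK_DEADLINE_S = 600  # highest ack deadline Pub/Sub accepts


def describe_trigger_cmd(project: str, trigger: str) -> list[str]:
    """argv that prints the Pub/Sub subscription behind an Eventarc trigger."""
    return [
        "gcloud", f"--project={project}",
        "eventarc", "triggers", "describe", trigger,
        "--format=value(transport.pubsub.subscription)",
    ]


def update_ack_deadline_cmd(subscription: str, seconds: int) -> list[str]:
    """argv that sets the subscription's acknowledgement deadline."""
    if not MIN_ACK_DEADLINE_S <= seconds <= MAX_ACK_DEADLINE_S:
        raise ValueError(
            f"ack deadline must be between {MIN_ACK_DEADLINE_S} "
            f"and {MAX_ACK_DEADLINE_S} seconds, got {seconds}"
        )
    return [
        "gcloud", "pubsub", "subscriptions", "update", subscription,
        f"--ack-deadline={seconds}",
    ]


def run(cmd: list[str]) -> str:
    """Run a command and return its trimmed stdout (needs gcloud on PATH)."""
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

Usage would look like `sub = run(describe_trigger_cmd(project, trigger))` followed by `run(update_ack_deadline_cmd(sub, 300))`, where `trigger` is the full trigger name returned by the list command.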
That's correct. Eventarc is backed by Pub/Sub, and a Pub/Sub subscription, by default, expects an acknowledgement within 10 seconds.
That's the default configuration of Eventarc.
And because your event processing takes 60 seconds, Pub/Sub never receives the acknowledgement in time and redelivers the event in a loop.
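As a rough illustration of that behaviour, here is a deliberately simplified model in Python. It is not how Pub/Sub is implemented: it ignores network overhead and retry backoff, and the function names are made up for this sketch. It only captures the rule that a response arriving after the acknowledgement deadline counts as a timeout.

```python
DEFAULT_ACK_DEADLINE_S = 10  # Pub/Sub default when no deadline is configured


def acked_in_time(processing_s: float,
                  ack_deadline_s: float = DEFAULT_ACK_DEADLINE_S) -> bool:
    """Simplified model: the push is acknowledged only if the handler's
    response arrives before the deadline expires."""
    return processing_s < ack_deadline_s


def describe(processing_s: float,
             ack_deadline_s: float = DEFAULT_ACK_DEADLINE_S) -> str:
    """Summarise the outcome for a handler of the given duration."""
    outcome = "acknowledged" if acked_in_time(processing_s, ack_deadline_s) else "redelivered"
    return f"{processing_s}s handler, {ack_deadline_s}s deadline: {outcome}"
```

Under this model, `describe(60)` reports a redelivery with the default deadline, while `describe(60, 300)` reports an acknowledgement, which matches what you observed with and without the long-running work.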
I had the same issue and shared it with the PM. For now, there is nothing in Eventarc (neither in the API nor in Terraform, which is my case) to fix that.
BUT because it's backed by Pub/Sub, you can update the Pub/Sub subscription and change its acknowledgement deadline. The name of the subscription is eventarc-<REGION or GLOBAL>-<EVENTARC NAME>-sub-<Random suffix>
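If you already have a list of subscription names, the naming pattern above can be used to pick out the one that belongs to a trigger. This is a small sketch with a hypothetical helper name; it assumes the pattern holds as I observed it, which is not a documented guarantee, so treat a missing or ambiguous match as a reason to check by hand.

```python
from __future__ import annotations


def find_trigger_subscriptions(names: list[str], region: str, trigger: str) -> list[str]:
    """Return the subscriptions whose short name matches
    eventarc-<region>-<trigger>-sub-<suffix>."""
    prefix = f"eventarc-{region}-{trigger}-sub-"
    # Accept both short names and full projects/.../subscriptions/... paths.
    return [n for n in names if n.rsplit("/", 1)[-1].startswith(prefix)]
```

For example, with region `XXX` and trigger `XXX-trigger`, the helper would select `projects/XXX/subscriptions/eventarc-XXX-XXX-trigger-sub-724` and skip unrelated subscriptions.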