Duplicate events in Eventarc triggered Google Cloud Run service


I have created a Google Cloud Run service that performs a BigQuery ETL operation in response to a BigQuery event being written to the audit log. My service is written as a Python Flask app and follows the principles given in "How to trigger Cloud Run actions on BigQuery events". More specifically, the service is triggered by Eventarc when Google Analytics data are imported into BigQuery.

I can test this locally by starting the app in a Docker container and sending the service a POST request that contains JSON from an appropriate audit log entry. It works as expected: the ETL operation is performed and no errors are returned.

The app deploys to Google Cloud without issue. Eventarc correctly triggers the service when the Google Analytics import is complete. The service runs as expected, correctly performing the ETL operation and returning a 200 OK response. But then the service is repeatedly invoked with the same event. This loop only stops when the next Eventarc trigger is activated.

  • The source event upon which Eventarc acts appears in the audit log only once.
  • My service logs the event JSON, enabling me to confirm that the service is indeed receiving the same event repeatedly.
  • The time between "retries" varies, but can be anything from a few seconds to around 10 minutes.
  • The retries continue even after removing and re-deploying the service and Eventarc trigger.
  • If I use curl to POST an event, the problem does not occur.

The ETL operation takes approximately 60 seconds. If I replace the ETL operation with a time.sleep(60) statement, the same problem occurs, and it still occurs with a 10-second sleep. However, if I remove the ETL operation and the sleep altogether, the retry loop stops.

Finally, the Metrics Explorer shows a series of webhook_timeout responses for "Cloud Pub/Sub Subscription - Push Requests".

All of this suggests to me that "the system" is retrying the event because it is taking too long. But why? And how do I fix it?

$ gcloud run services describe XXX-svc
✔ Service XXX-svc in region XXX

URL:     https://XXX
Ingress: internal
Traffic:
  100% LATEST (currently XXX)

Last updated on 2022-08-04T08:27:05.918172Z by XXX:
  Revision XXX
  Image:           XXX
  Port:            8080
  Memory:          512Mi
  CPU:             1000m
  Service account: XXX
  Concurrency:     80
  Min Instances:   1
  Max Instances:   1
  Timeout:         300s

$ gcloud --project="${PROJECT}" eventarc triggers describe XXX-trigger --location=XXX
createTime: '2022-08-04T06:59:33.232085395Z'
destination:
  cloudRun:
    region: XXX
    service: XXX-svc
eventFilters:
- attribute: resourceName
  operator: match-path-pattern
  value: projects/XXX/jobs/*
- attribute: type
  value: google.cloud.audit.log.v1.written
- attribute: serviceName
  value: bigquery.googleapis.com
- attribute: methodName
  value: google.cloud.bigquery.v2.JobService.InsertJob
name: projects/XXX/locations/XXX/triggers/XXX-trigger
serviceAccount: XXX
transport:
  pubsub:
    subscription: projects/XXX/subscriptions/eventarc-XXX-XXX-trigger-sub-724
    topic: projects/XXX/topics/eventarc-XXX-XXX-trigger-724
uid: XXX
updateTime: '2022-08-04T10:15:33.683873843Z'

Update

Thanks to the accepted answer from @guillaume blaquiere and the comment from @Pentium10, I was able to update the Pub/Sub subscription acknowledgement deadline:

# List Eventarc trigger names.
gcloud \
  --project="${PROJECT}" \
  eventarc triggers list \
  --format='value(name)'

TRIGGER="..."

# Get the Eventarc trigger Pub/Sub subscription name.
PUBSUB=$(gcloud \
  --project="${PROJECT}" \
  eventarc triggers describe "${TRIGGER}" \
  --format='value(transport.pubsub.subscription)')

# Describe the subscription.
gcloud \
  --format=json \
  pubsub subscriptions describe "${PUBSUB}"

# Update the acknowledgement deadline.
gcloud \
  pubsub subscriptions update "${PUBSUB}" \
  --ack-deadline=300
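
The value of 300 used above matches the Cloud Run request timeout shown in the service description. As a rough sketch of the reasoning behind choosing a deadline, the following assumes that Pub/Sub accepts acknowledgement deadlines between 10 and 600 seconds and treats a response arriving at or after the deadline as a timeout. The function names and the safety factor are illustrative assumptions, not part of any Google API:

```python
import math

# Pub/Sub limits for a subscription's acknowledgement deadline, in seconds.
MIN_ACK_DEADLINE = 10
MAX_ACK_DEADLINE = 600


def pick_ack_deadline(processing_seconds: float, safety_factor: float = 2.0) -> int:
    """Return an ack deadline that leaves headroom over the handler's run time.

    `safety_factor` is an illustrative assumption, not a Pub/Sub setting.
    """
    if processing_seconds < 0:
        raise ValueError("processing_seconds must be non-negative")
    wanted = math.ceil(processing_seconds * safety_factor)
    # Clamp into the range Pub/Sub accepts.
    return max(MIN_ACK_DEADLINE, min(MAX_ACK_DEADLINE, wanted))


def is_redelivered(processing_seconds: float, ack_deadline: int) -> bool:
    """Simplified model: a response at or after the deadline counts as a timeout."""
    return processing_seconds >= ack_deadline


if __name__ == "__main__":
    print(pick_ack_deadline(60))    # the ~60 s ETL from the question
    print(is_redelivered(60, 10))   # with the default 10 s deadline
    print(is_redelivered(60, 300))  # with the deadline set in the update above
```

Under this simplified model, a 60-second handler exceeds the default deadline but fits comfortably within 300 seconds, which is consistent with the behaviour described in the question.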

Solution

  • That's correct. Eventarc is backed by Pub/Sub, and by default a Pub/Sub subscription expects an acknowledgement within 10 seconds.

    That's the default configuration that Eventarc uses.

    Because your event processing takes 60 seconds, the deadline expires before your service responds, so Pub/Sub redelivers the event in a loop.


    I had the same issue and shared it with the PM. For now, there is nothing in Eventarc (neither the API nor Terraform, which is my case) to fix that.

    BUT because it's backed by Pub/Sub, you can update the Pub/Sub subscription directly and increase its acknowledgement deadline. The name of the subscription is eventarc-<REGION or GLOBAL>-<EVENTARC NAME>-sub-<Random suffix>
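
If you only have a list of subscription names (for example from a listing of the project's subscriptions), the naming pattern above can be used to pick out the one that belongs to a trigger. This is a minimal sketch that assumes the pattern holds exactly as described in this answer; the function name and the sample names are hypothetical, and the transport.pubsub.subscription field of the trigger description remains the more reliable source:

```python
import re


def find_eventarc_subscriptions(names, location, trigger):
    """Return the names that follow eventarc-<location>-<trigger>-sub-<suffix>.

    `names` may hold short names or full resource paths
    (projects/<project>/subscriptions/<name>).
    """
    pattern = re.compile(
        r"^eventarc-" + re.escape(location) + "-" + re.escape(trigger) + r"-sub-[^/]+$"
    )
    # Match on the last path component so full resource paths also work.
    return [n for n in names if pattern.match(n.rsplit("/", 1)[-1])]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    candidates = [
        "projects/my-project/subscriptions/eventarc-europe-west1-my-trigger-sub-724",
        "projects/my-project/subscriptions/some-other-subscription",
    ]
    print(find_eventarc_subscriptions(candidates, "europe-west1", "my-trigger"))
```

The location and trigger name are escaped before being placed in the regular expression, so names that contain characters such as dots are matched literally.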