google-cloud-platform google-cloud-storage google-cloud-dataflow google-cloud-spanner

GCP Dataflow streaming job/pipeline (unsuccessfully) draining for more than 24h-48h


I have a Dataflow job that reads from a Cloud Spanner change stream and writes the change records to GCS.

I started draining the Job 36-48h ago, but the job never left the draining state.

I then also canceled the pipeline that originally launched the Dataflow job. That was while the job was already draining; another 24h later, the job is still draining and still incurring costs.

There seems to be no SLA at all on how long a drain takes. The job did not fail and does not look stuck; it simply never finishes draining gracefully. I know that I can cancel or force-cancel it, but it's worrisome that the default graceful shutdown doesn't work under healthy, normal conditions.

Any suggestions on what's going on here?


Solution

  • The default Spanner Change Streams to BigQuery template currently doesn't support draining a job.
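
Since draining is not supported here, the practical way out is the cancel the question already mentions. Below is a minimal sketch of doing that from Python by shelling out to the gcloud CLI. The helper names (build_cancel_command, cancel_job), the dry_run switch, and the region us-central1 are illustrative assumptions, and JOB_ID is a placeholder for your own job ID. It assumes gcloud is installed and authenticated. Unlike a drain, a cancel stops the job immediately, so in-flight records may not be written out. Try a plain cancel first and keep --force as a last resort.

```python
import subprocess
from typing import List


def build_cancel_command(job_id: str, region: str, force: bool = False) -> List[str]:
    """Build the argv for `gcloud dataflow jobs cancel` (hypothetical helper)."""
    cmd = ["gcloud", "dataflow", "jobs", "cancel", job_id, f"--region={region}"]
    if force:
        # Force-cancel is meant for jobs that a regular cancel could not stop.
        cmd.append("--force")
    return cmd


def cancel_job(job_id: str, region: str, force: bool = False,
               dry_run: bool = True) -> List[str]:
    """Print the command (dry run) or actually execute it via gcloud."""
    cmd = build_cancel_command(job_id, region, force)
    if dry_run:
        # Show what would be executed without touching the job.
        print(" ".join(cmd))
    else:
        # Raises CalledProcessError if gcloud exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # JOB_ID and the region are placeholders; substitute your own values.
    cancel_job("JOB_ID", "us-central1")
```

Calling cancel_job(..., dry_run=False) runs the command for real; pass force=True only if the regular cancel does not take effect.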