Tags: google-cloud-platform, google-cloud-dataflow, apache-beam, google-cloud-spanner

Unable to launch GCP Dataflow pipeline (Spanner to GCS) using default templates


I am trying to run a default GCP Dataflow pipeline template (Cloud Spanner to GCS), but all my attempts to start the pipeline/job fail with a message indicating that a result file is missing. I have not modified any of the template's default options.

Failed to read the result file : 
gs://dataflow-staging-us-central1-11111111111/staging/template_launches/2023-06-19_10_55_21-884192550311219509/operation_result with error message: 
(8bac83beae18b544): Unable to open template file: gs://dataflow-staging-us-central1-11111111111/staging/template_launches/2023-06-19_10_55_21-884192550311219509/operation_result..
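
For reference, a quick way to check whether that result file was ever written (bucket and job ID copied from the error above; assumes the gcloud storage commands from the Cloud SDK are available):

# List the template launch directory referenced in the error message
gcloud storage ls \
    gs://dataflow-staging-us-central1-11111111111/staging/template_launches/2023-06-19_10_55_21-884192550311219509/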

Interestingly, I managed to get the pipeline working once, a few days ago. I stopped the pipeline, then attempted it again today, and all variations failed.

  • I've recreated the Spanner instances, DBs, and Spanner streams
  • I've recreated the GCS buckets
  • I've used both the web interface and gcloud (see the sketch after this list)
  • I've tried all of the above in different regions
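
The gcloud attempts looked roughly like the following. This is only a sketch: the template path and parameter names are taken from the Google-provided Cloud_Spanner_to_GCS_Text classic template as I remember them, and all resource names are placeholders.

# Roughly how the gcloud launch attempts looked (classic template; names are placeholders)
gcloud dataflow jobs run spanner-to-gcs-export \
    --gcs-location=gs://dataflow-templates-us-central1/latest/Cloud_Spanner_to_GCS_Text \
    --region=us-central1 \
    --parameters=spannerInstanceId=<instance>,spannerDatabaseId=<database>,spannerTable=<table>,textWritePrefix=gs://<bucket>/export/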

At this point, I am clueless as to why that one attempt a few days ago worked while all of my current efforts fail.

Any idea why the jobs keep failing?



Solution


    Let me preface this: I tried everything using the Google Cloud Console (web UI) and wasn't able to get a pipeline running. However, creating the job with the gcloud CLI worked.

    1. Create a service account with the permissions described here (a rough sketch of steps 1-3 as gcloud commands follows the command below)
    2. Create a GCS bucket (and add the just-created service account as a principal)
    3. Run gcloud auth application-default login in your local shell
    4. Execute the command below. Update us-central1 in the template file location (and everywhere else) to match your region. More parameters are listed here
    gcloud dataflow flex-template run spanner-to-bigquery \
        --template-file-gcs-location=gs://dataflow-templates-us-central1/2023-06-06-00_RC00/flex/Spanner_Change_Streams_to_BigQuery \
        --region us-central1 \
        --project=<your_gcs_project> \
        --service-account-email=dataflow-spanner-to-bq@<your_gcs_project>.iam.gserviceaccount.com \
        --parameters \
    spannerInstanceId=development,\
    spannerDatabase=main,\
    spannerMetadataInstanceId=development,\
    spannerMetadataDatabase=main-meta,\
    spannerChangeStreamName=AllStream,\
    bigQueryDataset=dev_spanner_all,\
    numWorkers=1,\
    enableStreamingEngine=true,\
    tempLocation=gs://create_a_gs_bucket/tmp,\
    stagingLocation=gs://create_a_gs_bucket/staging,\
    workerRegion=us-central1
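
    For completeness, steps 1-3 can also be done from the CLI. This is only a rough sketch: the exact roles your service account needs are whatever the permissions doc linked in step 1 lists, so the role shown here is an assumption, and the names reuse the placeholders from the command above.

    # 1. Create the service account the Dataflow job will run as
    gcloud iam service-accounts create dataflow-spanner-to-bq \
        --project=<your_gcs_project> \
        --display-name="Dataflow Spanner to BigQuery"

    # Grant roles; repeat for the other roles the permissions doc lists
    # (e.g. Spanner, BigQuery, and Storage access)
    gcloud projects add-iam-policy-binding <your_gcs_project> \
        --member="serviceAccount:dataflow-spanner-to-bq@<your_gcs_project>.iam.gserviceaccount.com" \
        --role="roles/dataflow.worker"

    # 2. Create the bucket used for temp/staging files
    gcloud storage buckets create gs://create_a_gs_bucket \
        --project=<your_gcs_project> \
        --location=us-central1

    # 3. Authenticate your local shell
    gcloud auth application-default login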
    

    Previously

    This isn't a solution, but merely a poor workaround. It seems to be a GCP issue that should be addressed (or at least some kind of resolvable error should be displayed).


    Not a solution per se, but creating a new GCP project and replicating a simple Spanner, GCS, and Dataflow setup there worked without any issues.

    It seems as if there is metadata, or there are fragments, in my other project that prevent the Dataflow pipelines from launching correctly.


    EDIT

    Out of curiosity, I attempted to disable all related APIs in the GCP project in which the Dataflow launch keeps failing (roughly as sketched below). Even after disabling and re-enabling them, the launch still fails.
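
    What "disabling and re-enabling" looked like, roughly. Which APIs count as "related" is my own guess (Dataflow, Spanner, Cloud Storage); adjust the list to your project. --force is needed when other enabled services depend on the ones being disabled.

    # Disable the APIs the pipeline depends on, then re-enable them
    gcloud services disable dataflow.googleapis.com spanner.googleapis.com storage.googleapis.com \
        --project=<your_gcs_project> --force

    gcloud services enable dataflow.googleapis.com spanner.googleapis.com storage.googleapis.com \
        --project=<your_gcs_project>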