Search code examples
google-cloud-storagegoogle-cloud-dataflowgoogle-authenticationbad-request

Google Dataflow Pipeline creation fails with 400: Bad Request / invalid grant


I have been building and creating templates for google dataflow for over a year now. I never had a problem creating templates and uploading them to gcs with the options.setTemplateLocation(templatePath); call. Since today, when creating the Pipeline with Pipeline.create(options); and running the java-program in eclipse, I get following exception:

Exception in thread "main" java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:233)
    at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:162)
    at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:52)
    at org.apache.beam.sdk.Pipeline.create(Pipeline.java:142)
    at mypackage.PipelineCreation.getTemplatePipeline(PipelineCreation.java:34)
    at myotherpackage.Main.main(Main.java:51)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:222)
    ... 5 more
Caused by: java.lang.RuntimeException: Unable to verify that GCS bucket gs://my-projects-staging-bucket exists.
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPathIsAccessible(GcsPathValidator.java:92)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.validateOutputFilePrefixSupported(GcsPathValidator.java:61)
    at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions(DataflowRunner.java:228)
    ... 10 more
Caused by: com.google.api.client.http.HttpResponseException: 400 Bad Request
{
  "error" : "invalid_grant",
  "error_description" : "Bad Request"
}
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1070)
    at com.google.auth.oauth2.UserCredentials.refreshAccessToken(UserCredentials.java:207)
    at com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:149)
    at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:135)
    at com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:96)
    at com.google.cloud.hadoop.util.ChainingHttpRequestInitializer.initialize(ChainingHttpRequestInitializer.java:52)
    at com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:93)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:300)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.hadoop.util.ResilientOperation$AbstractGoogleClientRequestExecutor.call(ResilientOperation.java:166)
    at com.google.cloud.hadoop.util.ResilientOperation.retry(ResilientOperation.java:66)
    at org.apache.beam.sdk.util.GcsUtil.getBucket(GcsUtil.java:505)
    at org.apache.beam.sdk.util.GcsUtil.bucketAccessible(GcsUtil.java:492)
    at org.apache.beam.sdk.util.GcsUtil.bucketAccessible(GcsUtil.java:457)
    at org.apache.beam.sdk.extensions.gcp.storage.GcsPathValidator.verifyPathIsAccessible(GcsPathValidator.java:88)
    ... 12 more

I was logged-in today with another account into gcloud but logged in again with the account associated with the project as "Owner" with gcloud auth login. I also restarted Eclipse but the same error keeps occuring. Also when trying to run the pipeline locally, I get another error but also with the "invalid_grant" "bad request" content. Restarting the laptop also had no effect.

My pom defines the google-cloud-dataflow-java-sdk-all with version 2.2.0 and upgrading to 2.5.0 had no effect.

I am able to copy data to the bucket with gsutil from commandline. But when running the java-program from command-line with mvn compile exec:java -Dexec.mainClass=mypackage.Main i still get the same errors.

My function to create a templatePipeline looks like the following:

public static Pipeline getTemplatePipeline(String jobName, String templatePath){
        DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
        options.setProject("my-project-id");
        options.setRunner(DataflowRunner.class);
        options.setStagingLocation("gs://my-projects-staging-bucket/binaries");
        options.setTempLocation("gs://my-projects-staging-bucket/binaries/tmp");
        options.setGcpTempLocation("gs://my-projects-staging-bucket/binaries/tmp");
        options.setZone("europe-west3-a");
        options.setWorkerMachineType("n1-standard-2");
        options.setJobName(jobName);
        options.setMaxNumWorkers(2);
        options.setDiskSizeGb(40);
        options.setTemplateLocation(templatePath);
        return Pipeline.create(options);
    }

Any help is highly appreciated.


Solution

  • I found the solution in the quickstart docs.

    It seems like the gcloud auth is no longer used and you have to use a service account. So like in the docs I created a service account with role "project/owner" and downloaded it's json file to $path.

    Then on my Mac i used export GOOGLE_APPLICATION_CREDENTIALS="$path" and within the same session used the command mentioned in the question to compile and execute the java-program.