gcloud beta dataflow missing documentation on parameters


When I try to run the following:

gcloud beta dataflow jobs run $JOB_NAME \
--region $GCP_REGION \
--gcs-location gs://dataflow-templates/latest/PubSub_to_BigQuery \
--network $VPC_NAME \
--subnetwork regions/$GCP_REGION/subnetworks/$VPC_SUBNETWORK \
--staging-location gs://$GCP_PROJECT_ID-assoc-history-dataflow \
--worker-machine-type n1-standard-1 \
--parameters \
"inputTopic=$INPUT_TOPIC,\
outputTableSpec=$OUTPUT_SPEC,\ 
javascriptTextTransformGcsPath=$UDF_LOCATION,\
javascriptTextTransformFunctionName=myFunctionName"

gcloud gives me the following error:

ERROR: (gcloud.beta.dataflow.jobs.run) INVALID_ARGUMENT: The template parameters are invalid.
- '@type': type.googleapis.com/google.dataflow.v1beta3.InvalidTemplateParameters
  parameterViolations:
  - description: Unrecognized parameter
    parameter: |-
      \
      javascriptTextTransformGcsPath

According to the current documentation (https://cloud.google.com/dataflow/docs/guides/templates/provided-utilities), javascriptTextTransformGcsPath is a valid template parameter. I'm using the beta gcloud SDK so that I can specify my VPC and subnet. Is there documentation on how the JavaScript UDF parameter has changed, or is there another way to specify it?


Solution

  • This happens because you wrapped the parameters in quotation marks ("). The documentation gives this example, which does not use them:

    gcloud dataflow jobs run JOB_NAME \
    --gcs-location gs://dataflow-templates/latest/Bulk_Decompress_GCS_Files \
    --parameters \
    inputFilePattern=gs://YOUR_BUCKET_NAME/compressed/*.gz,\
    outputDirectory=gs://YOUR_BUCKET_NAME/decompressed,\
    outputFailureFile=OUTPUT_FAILURE_FILE_PATH
    

    Just remove the quotation marks and it should work.
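
One more detail may be worth ruling out. The error reports the rejected parameter name as "\ javascriptTextTransformGcsPath", and the outputTableSpec line in the question ends with a backslash followed by a space. The shell treats a backslash as a line continuation only when the newline comes immediately after it, so a stray space there would leave the backslash, the space and the newline inside the parameter name. The sketch below illustrates that behaviour with placeholder values (t and p) and a hypothetical helper, find_broken_continuations, that lists such lines in a script. It is an illustration of shell quoting under those assumptions, not a confirmed diagnosis of the job above.

```shell
#!/bin/sh
# Case 1: backslash directly before the newline inside double quotes.
# The shell removes both characters and joins the two lines.
good="outputTableSpec=t,\
javascriptTextTransformGcsPath=p"

# Case 2: the same text with a space between the backslash and the newline.
# The source is built with printf so the trailing space stays visible,
# then eval lets the shell parse it exactly as it would parse a script line.
src=$(printf 'bad="outputTableSpec=t,\\ \njavascriptTextTransformGcsPath=p"')
eval "$src"

# Take the text after the first comma, then cut it at the first "=",
# which leaves the name of the second parameter.
good_rest=${good#*,}
good_key=${good_rest%%=*}
bad_rest=${bad#*,}
bad_key=${bad_rest%%=*}

printf 'good key: [%s]\n' "$good_key"
printf 'bad key:  [%s]\n' "$bad_key"

# Hypothetical helper: print every line of a script where whitespace
# follows a trailing backslash, prefixed with its line number.
find_broken_continuations() {
  grep -nE '\\[[:space:]]+$' "$1"
}
```

Running find_broken_continuations on the script that holds the gcloud command prints any line whose trailing backslash is followed by whitespace. Deleting that whitespace restores the line continuation.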