google-cloud-dataflow, pipeline, gcloud

Creating a gcloud command for Dataflow Pipeline job


We recently created a Dataflow Job and Pipeline within the Google Cloud Console. For record-keeping purposes, I want to record the gcloud equivalent commands for both the job and pipeline. I managed to determine the gcloud equivalent command for the Dataflow Job, but I am unable to figure out how to create the gcloud equivalent for the Dataflow Pipeline.

Sample Dataflow Job gcloud command:

gcloud dataflow jobs run sample_dataflow_job \
  --gcs-location gs://dataflow-templates-us-east1/latest/Jdbc_to_BigQuery \
  --region us-east1 \
  --num-workers 2 \
  --staging-location gs://dataflow_single_region_us/writingdirectory \
  --subnetwork https://www.googleapis.com/compute/v1/projects/sample-project/regions/us-east1/subnetworks/project_network \
  --disable-public-ips \
  --parameters 'connectionURL=jdbc:mysql://psql.gcp.sample.net:6033/sample,driverClassName=com.mysql.cj.jdbc.Driver,query=select * from datab,outputTable=bigquerytable:sample.sample_DataFlow_1,driverJars=gs://dataflow_single_region_us/jdbc_driver,bigQueryLoadingTemporaryDirectory=gs://dataflow_single_region_us/bigqueryloading,username=johnsmith,password=Password1'

Any ideas how I can get the gcloud command for the Dataflow Pipeline solution?


Solution

  • I dug into the Cloud SDK documentation and found that this feature was released only a couple of days ago; see the Google Cloud CLI release notes entry for datapipelines. It is still in beta. The gcloud beta datapipelines reference page has additional details.

    Because the feature is still in beta, support may be limited. It is not recommended for production environments and is suitable only for testing at the moment. For now, I think we will have to wait until it is fully released.
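
If you want to try the beta command anyway, here is a rough sketch of what the pipeline counterpart of the job command could look like. Treat it as a starting point, not a verified equivalent: the pipeline name and the cron schedule are made up for illustration, reusing the staging directory as --temp-location is my assumption, and the flag names reflect my reading of the gcloud beta datapipelines pipeline create reference, so check them with --help since beta flags can change. The script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: a pipeline counterpart to the `gcloud dataflow jobs run` command
# from the question. Flag names are assumptions based on the beta reference;
# confirm with `gcloud beta datapipelines pipeline create --help`.
set -eu

# Print the command by default; execute it only when DRY_RUN=0 is set,
# so that nothing is created by accident.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Arguments are joined with spaces; the original quoting is not shown.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

create_pipeline() {
  # Single quotes keep the shell from expanding `*` or splitting the query.
  params='connectionURL=jdbc:mysql://psql.gcp.sample.net:6033/sample,driverClassName=com.mysql.cj.jdbc.Driver,query=select * from datab,outputTable=bigquerytable:sample.sample_DataFlow_1,driverJars=gs://dataflow_single_region_us/jdbc_driver,bigQueryLoadingTemporaryDirectory=gs://dataflow_single_region_us/bigqueryloading,username=johnsmith,password=Password1'

  # sample_dataflow_pipeline and the schedule (daily at 02:00) are hypothetical.
  # --temp-location reuses the job's staging directory, which is an assumption.
  run gcloud beta datapipelines pipeline create sample_dataflow_pipeline \
    --region us-east1 \
    --pipeline-type batch \
    --template-file-gcs-location gs://dataflow-templates-us-east1/latest/Jdbc_to_BigQuery \
    --schedule '0 2 * * *' \
    --temp-location gs://dataflow_single_region_us/writingdirectory \
    --num-workers 2 \
    --subnetwork https://www.googleapis.com/compute/v1/projects/sample-project/regions/us-east1/subnetworks/project_network \
    --disable-public-ips \
    --parameters "$params"
}

create_pipeline
```

Run it as is to review the assembled command, then rerun with DRY_RUN=0 once the flags match what --help reports for your SDK version.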