We recently created a Dataflow Job and Pipeline within the Google Cloud Console.
For record-keeping purposes, I want to record the gcloud equivalent commands for both the job and the pipeline. I managed to determine the gcloud equivalent command for the Dataflow Job, but I am unable to figure out the gcloud equivalent for the Dataflow Pipeline.
Sample Dataflow Job gcloud command:
gcloud dataflow jobs run sample_dataflow_job \
  --gcs-location gs://dataflow-templates-us-east1/latest/Jdbc_to_BigQuery \
  --region us-east1 \
  --num-workers 2 \
  --staging-location gs://dataflow_single_region_us/writingdirectory \
  --subnetwork https://www.googleapis.com/compute/v1/projects/sample-project/regions/us-east1/subnetworks/project_network \
  --disable-public-ips \
  --parameters "connectionURL=jdbc:mysql://psql.gcp.sample.net:6033/sample,driverClassName=com.mysql.cj.jdbc.Driver,query=select * from datab,outputTable=bigquerytable:sample.sample_DataFlow_1,driverJars=gs://dataflow_single_region_us/jdbc_driver,bigQueryLoadingTemporaryDirectory=gs://dataflow_single_region_us/bigqueryloading,username=johnsmith,password=Password1"
Any ideas how I can get the gcloud command for the Dataflow Pipeline?
I dug a bit into the Cloud SDK documentation and found that this feature was released a couple of days ago; see the Google Cloud CLI - Release Notes entry about cloud datapipelines. It's still in beta. You can check the gcloud beta datapipelines page for additional details.
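Going by that reference, the equivalent create command should look roughly like the sketch below, reusing the template and parameters from your job command. I haven't run this myself, so treat it as a sketch: the flag names (--pipeline-type, --template-file-gcs-location, --schedule) and the PIPELINE_TYPE_BATCH value are my reading of the beta reference, and the schedule is a made-up placeholder.

# Untested sketch based on the beta reference; verify the flags with
# "gcloud beta datapipelines pipelines create --help" before relying on it.
gcloud beta datapipelines pipelines create sample_dataflow_pipeline \
  --region=us-east1 \
  --pipeline-type=PIPELINE_TYPE_BATCH \
  --template-file-gcs-location=gs://dataflow-templates-us-east1/latest/Jdbc_to_BigQuery \
  --schedule="0 * * * *" \
  --parameters="connectionURL=jdbc:mysql://psql.gcp.sample.net:6033/sample,driverClassName=com.mysql.cj.jdbc.Driver,query=select * from datab,outputTable=bigquerytable:sample.sample_DataFlow_1,driverJars=gs://dataflow_single_region_us/jdbc_driver,bigQueryLoadingTemporaryDirectory=gs://dataflow_single_region_us/bigqueryloading,username=johnsmith,password=Password1"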
As this feature is still in beta, it may have limited support. It's not recommended for production environments and is only meant for testing at the moment. For now, I think we will have to wait until it's fully released.