apache-spark, gcloud, google-cloud-dataproc

(gcloud.dataproc.batches.submit.spark) unrecognized arguments: --subnetwork=


I am trying to submit a Google Dataproc batch job. As per the Batch Job documentation, we can pass the subnetwork as a parameter. But when I use it, I get

ERROR: (gcloud.dataproc.batches.submit.spark) unrecognized arguments: --subnetwork=

Here is the gcloud command I have used:

gcloud dataproc batches submit spark \
    --region=us-east4 \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
    --class=org.apache.spark.examples.SparkPi \
    --subnetwork="https://www.googleapis.com/compute/v1/projects/myproject/regions/us-east4/subnetworks/network-svc" \
    -- 1000

Solution

  • According to the Dataproc batches docs, the subnetwork URI needs to be specified using the --subnet argument, not --subnetwork.

    Try:

    gcloud dataproc batches submit spark \
        --region=us-east4 \
        --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
        --class=org.apache.spark.examples.SparkPi \
        --subnet="https://www.googleapis.com/compute/v1/projects/myproject/regions/us-east4/subnetworks/network-svc" \
        -- 1000
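
    If the subnetwork is in the same project and region as the batch job, --subnet can usually be given the short subnet name rather than the full resource URI. A minimal sketch, assuming a subnet named network-svc exists in myproject / us-east4:

    gcloud dataproc batches submit spark \
        --region=us-east4 \
        --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
        --class=org.apache.spark.examples.SparkPi \
        --subnet=network-svc \
        -- 1000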