On GCP, I instantiate Dataproc workflow templates to run my processing. I'd like to enable Stackdriver logging to get more metrics (see https://cloud.google.com/dataproc/docs/guides/stackdriver-logging).
According to the documentation, I should set the property:
dataproc:dataproc.logging.stackdriver.job.driver.enable=true
My workflow template looks like this:
placement:
  managedCluster:
    clusterName: my-cluster
    config:
      gceClusterConfig:
        zoneUri: europe-west1-d
      masterConfig:
        machineTypeUri: n1-standard-4
      workerConfig:
        machineTypeUri: n1-standard-4
        numInstances: 10
Where should I set this property?
Thanks.
The template below should work.
Because the API hierarchy is deeply nested, you can build the initial template with the gcloud dataproc workflow-templates commands; the describe command will then give you the correct YAML or JSON. After that, you can iterate quickly using instantiate-inline from a local file.
placement:
  managedCluster:
    clusterName: my-cluster
    config:
      gceClusterConfig:
        zoneUri: europe-west1-d
      masterConfig:
        machineTypeUri: n1-standard-4
      workerConfig:
        machineTypeUri: n1-standard-4
        numInstances: 10
      softwareConfig:
        properties:
          # Quoted because property values are strings.
          dataproc:dataproc.logging.stackdriver.job.driver.enable: 'true'
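If you prefer to patch an existing template programmatically rather than by hand, here is a minimal sketch in Python. It assumes the template has already been loaded into a plain dict (for example from the JSON that describe returns); the helper name enable_driver_logging is hypothetical and not part of any Google library. It only shows where the property sits in the hierarchy: placement, managedCluster, config, softwareConfig, properties.

```python
import copy
import json

# Property name taken from the Dataproc Stackdriver logging guide.
PROPERTY = "dataproc:dataproc.logging.stackdriver.job.driver.enable"


def enable_driver_logging(template: dict) -> dict:
    """Return a copy of the template with the driver-logging property set.

    Hypothetical helper for illustration; it leaves the input untouched.
    """
    result = copy.deepcopy(template)
    node = result
    # Walk down to softwareConfig.properties, creating levels that are missing.
    for key in ("placement", "managedCluster", "config",
                "softwareConfig", "properties"):
        node = node.setdefault(key, {})
    # Property values are strings, so store "true" rather than a boolean.
    node[PROPERTY] = "true"
    return result


# The template from the question, expressed as a dict.
template = {
    "placement": {
        "managedCluster": {
            "clusterName": "my-cluster",
            "config": {
                "gceClusterConfig": {"zoneUri": "europe-west1-d"},
                "masterConfig": {"machineTypeUri": "n1-standard-4"},
                "workerConfig": {
                    "machineTypeUri": "n1-standard-4",
                    "numInstances": 10,
                },
            },
        }
    }
}

patched = enable_driver_logging(template)
print(json.dumps(patched, indent=2))
```

The printed JSON has the same shape as the YAML above, so it can be saved to a local file and used for the same fast iteration.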