Tags: google-cloud-platform, virtual-machine, cluster-computing, shutdown, google-cloud-dataproc

How can I prevent Google Cloud Dataproc cluster VM instances from auto-shutoff?


When I run a Dataproc cluster and its nodes, the VM instance shuts off automatically after about 30 minutes, even while I am actively using the cluster and running things on it. I cannot find the setting that controls this. I would appreciate any help on how to disable it, or on how to configure a new cluster so that this does not happen.

Thank you


Solution

  • Default Dataproc clusters do not have any kind of automatic shutdown.

    If you are using the older Datalab initialization action, you are probably seeing Datalab's own shutdown functionality, which is not Dataproc-aware. You can disable it in one of the ways suggested here: How to keep Google Dataproc master running?

    Otherwise, if you're creating your Dataproc cluster from a template or from copied and pasted arguments, you may be accidentally enabling "scheduled deletion", which deletes a cluster after a configured idle period or at a configured time: https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/scheduled-deletion

    If neither of those explains your situation, check the "activity logs" in the "Cloud Logging" interface: select Cloud Dataproc Cluster and open the activity_log type of logs to see an audit log of who deleted your cluster. Alternatively, if the cluster still exists in Dataproc but the underlying VM was shut down, go to the "Compute Engine VM" log category and look at its "activity logs" to see who stopped your VMs. Sometimes, in a shared project, a project admin runs a script that automatically shuts down VMs to save cost.
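The last two checks above can be sketched in Python. This is a minimal illustration, not an official tool: it assumes you have saved the output of `gcloud dataproc clusters describe` with `--format=json`, and that the scheduled-deletion fields appear under `config.lifecycleConfig` as named in the Dataproc REST API. The function names, the sample cluster, and the method-name fragments used in the filters are hypothetical choices for this example.

```python
import json

# Field names follow the Dataproc REST API's LifecycleConfig message.
LIFECYCLE_FIELDS = ("idleDeleteTtl", "autoDeleteTtl", "autoDeleteTime")


def scheduled_deletion_settings(cluster: dict) -> dict:
    """Return whichever scheduled-deletion fields are set on a cluster description."""
    lifecycle = cluster.get("config", {}).get("lifecycleConfig", {})
    return {key: lifecycle[key] for key in LIFECYCLE_FIELDS if key in lifecycle}


def audit_log_filter(resource_type: str, method_fragment: str) -> str:
    """Build a Cloud Logging filter for Admin Activity audit entries.

    The ':' operator is a substring match, so the method fragment does not
    need to include the API version prefix.
    """
    return (
        f'resource.type="{resource_type}" AND '
        'logName:"cloudaudit.googleapis.com%2Factivity" AND '
        f'protoPayload.methodName:"{method_fragment}"'
    )


# Hypothetical sample, shaped like the JSON description of a cluster.
sample = json.loads(
    '{"clusterName": "example-cluster",'
    ' "config": {"lifecycleConfig": {"idleDeleteTtl": "1800s"}}}'
)

# An empty result means no scheduled deletion is configured on the cluster.
print(scheduled_deletion_settings(sample))

# Who deleted the Dataproc cluster?
print(audit_log_filter("cloud_dataproc_cluster", "DeleteCluster"))

# Who stopped the underlying Compute Engine VMs?
print(audit_log_filter("gce_instance", "instances.stop"))
```

The printed filter strings can be pasted into the Logs Explorer query box or passed to `gcloud logging read`. The `protoPayload.authenticationInfo.principalEmail` field of each matching entry shows which account made the call.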