I realize a Databricks cluster has a timeout, meaning after N minutes it will turn the cluster off. Here's a sample.
As nice as this feature is, though, it is not what we need. Our team works from 8AM to 6PM on weekdays. We want the cluster to auto-start at 8AM, stay "always on" during working hours, THEN time out after, say, 6PM. Make sense?
Q: Is this possible?
Yes, it is possible to start and stop the Databricks cluster around your team's 8 AM to 6 PM weekday hours using Azure Automation.
To start the cluster at 8 AM, create a PowerShell runbook in Azure Automation and link it to a schedule that fires at that time. The runbook should look like this:
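# Requires the DatabricksPS module to be imported into the Automation account.
$accessToken = "<Personal_Access_Token>"
$apiUrl = "<Azure_Databricks_Endpoint_URL>"
# Point the module at the workspace, then start the cluster.
Set-DatabricksEnvironment -AccessToken $accessToken -ApiRootUrl $apiUrl
Start-DatabricksCluster -ClusterID "<Cluster_ID>"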
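Since the team only works on weekdays, the start runbook can also check the day itself instead of relying purely on the schedule. This is a minimal sketch, not part of DatabricksPS: `Test-IsWeekday` is a hypothetical helper name, and it assumes the date it receives is already in the team's local time zone.

```powershell
# Hypothetical helper (not part of DatabricksPS): $true on Monday through Friday.
function Test-IsWeekday {
    param(
        [Parameter(Mandatory)][datetime]$Date
    )
    # DayOfWeek is a System.DayOfWeek enum value.
    $weekend = @([System.DayOfWeek]::Saturday, [System.DayOfWeek]::Sunday)
    return $Date.DayOfWeek -notin $weekend
}

# Guard for the top of the start runbook.
# Assumption: Get-Date returns the team's local time; convert first if the job runs in another zone.
if (Test-IsWeekday -Date (Get-Date)) {
    Write-Output "Weekday: the cluster start call would go here."
} else {
    Write-Output "Weekend: skipping cluster start."
}
```

A weekly Azure Automation schedule restricted to Monday through Friday achieves the same result without code; the guard is mainly useful when the runbook is attached to a daily schedule.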
To stop the cluster at 6 PM, create a second PowerShell runbook in Azure Automation and link it to a schedule that fires at that time. The runbook should look like this:
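# Requires the DatabricksPS module to be imported into the Automation account.
$accessToken = "<Personal_Access_Token>"
$apiUrl = "<Azure_Databricks_Endpoint_URL>"
# Point the module at the workspace, then stop (terminate) the cluster.
Set-DatabricksEnvironment -AccessToken $accessToken -ApiRootUrl $apiUrl
Stop-DatabricksCluster -ClusterID "<Cluster_ID>"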
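If you would rather not depend on a module, the same two actions can be sent straight to the Databricks Clusters REST API (`/api/2.0/clusters/start` and `/api/2.0/clusters/delete`, where `delete` terminates the cluster rather than removing it permanently). The following is only a sketch: `New-ClusterRequest` is a hypothetical helper name, and it assumes the endpoint URL is the workspace base URL without an `/api` suffix.

```powershell
# Hypothetical helper: builds the REST request for a cluster action without sending it.
function New-ClusterRequest {
    param(
        [Parameter(Mandatory)][string]$ApiUrl,       # assumed: workspace base URL, no /api suffix
        [Parameter(Mandatory)][string]$AccessToken,  # personal access token
        [Parameter(Mandatory)][string]$ClusterId,
        [Parameter(Mandatory)][ValidateSet("start", "delete")][string]$Action
    )
    # The hashtable keys match Invoke-RestMethod parameter names so it can be splatted.
    return @{
        Method      = "Post"
        Uri         = "$($ApiUrl.TrimEnd('/'))/api/2.0/clusters/$Action"
        Headers     = @{ Authorization = "Bearer $AccessToken" }
        ContentType = "application/json"
        Body        = (@{ cluster_id = $ClusterId } | ConvertTo-Json)
    }
}

# Use -Action "start" in the 8 AM runbook and -Action "delete" in the 6 PM runbook.
$request = New-ClusterRequest -ApiUrl "<Azure_Databricks_Endpoint_URL>" `
    -AccessToken "<Personal_Access_Token>" -ClusterId "<Cluster_ID>" -Action "start"

# Uncomment once the placeholders are filled in:
# Invoke-RestMethod @request
```

Building the request separately from sending it keeps the placeholders easy to inspect before anything touches the workspace.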
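Alternatively, instead of a stop runbook, you can set the cluster property "Terminate after 600 minutes of inactivity".
Note: Business hours of 8 AM to 6 PM are 10 hours x 60 minutes = 600 minutes, which is where the value comes from. Keep in mind that this timer counts idle time, so the cluster shuts down 600 minutes after its last activity rather than at exactly 6 PM.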
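The 600-minute value is simply the length of the working day, and because the setting measures inactivity, the actual shutdown time depends on when the cluster was last used. A small sketch of both calculations follows; `Get-BusinessMinutes` and `Get-AutoTerminationTime` are hypothetical helper names used only for illustration.

```powershell
# Hypothetical helper: length of the working day in minutes.
function Get-BusinessMinutes {
    param([int]$StartHour = 8, [int]$EndHour = 18)
    if ($EndHour -le $StartHour) { throw "EndHour must be later than StartHour." }
    return ($EndHour - $StartHour) * 60
}

# Hypothetical helper: when an idle cluster would shut down, given its last activity.
function Get-AutoTerminationTime {
    param(
        [Parameter(Mandatory)][datetime]$LastActivity,
        [Parameter(Mandatory)][int]$IdleMinutes
    )
    return $LastActivity.AddMinutes($IdleMinutes)
}

$idleMinutes = Get-BusinessMinutes -StartHour 8 -EndHour 18   # 10 hours x 60 = 600

# Last command finished at 5 PM on a Monday: shutdown lands at 3 AM on Tuesday.
Get-AutoTerminationTime -LastActivity ([datetime]'2024-01-01T17:00:00') -IdleMinutes $idleMinutes
```

If the cluster has to be off right at 6 PM, the scheduled stop runbook is the more predictable choice, with a shorter inactivity timeout as a safety net.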
The tutorial "Start Azure Databricks clusters during business hours" walks you through creating a PowerShell Workflow runbook in Azure Automation that starts Azure Databricks clusters during business hours.