Does anyone have a way to monitor a group of job clusters in Azure Databricks?
We just want to make sure the job clusters are up and running, maybe with a dashboard or workbook in Azure that turns red or green depending on the status of each job cluster.
We have these NRT (near-real-time) interfaces pulling data from a source application via these job clusters and would like to see when they are down. We already get an alert when the service goes down, but having a panel where we can see these interfaces would be really useful. Perhaps something that makes use of an API call would be needed, unless there is something out of the box like those Ganglia reports, but I haven't seen anything close to what I'm looking for.
Thanks in advance for any answer you may provide.
You can get the status of Azure Databricks jobs by calling the Jobs API; refer to the steps below:
Create a PAT (personal access token) like below:
Copy the token and save it; you will use it to call the APIs later.
I created a Databricks cluster and a job to run a notebook, like below:
Ran the job:
Called the API to get the job details, like below:
https://adb-xxxxxxxxxxxx8.18.azuredatabricks.net/api/2.1/jobs/list
Select Authorization as Bearer Token and add the PAT that we generated above, like below:
Got output like below:
You can call this API on a schedule and use the response to monitor the job status.
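To sketch how this could feed a red/green panel, here is a minimal Python example using only the standard library. It calls the `/api/2.1/jobs/runs/list` endpoint with the PAT as a Bearer token and maps each run's state to a colour. The workspace URL is a placeholder, and the green/red mapping is my own assumption (running or succeeded = green, anything else = red); adjust it to your definition of healthy.

```python
import json
import urllib.request

# Placeholder workspace URL -- replace with your own.
DATABRICKS_HOST = "https://adb-xxxxxxxxxxxx.azuredatabricks.net"


def list_job_runs(host, token, active_only=False):
    """Call the Jobs API 2.1 runs/list endpoint using a PAT as a Bearer token."""
    url = f"{host}/api/2.1/jobs/runs/list?active_only={str(active_only).lower()}"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("runs", [])


def run_health(run):
    """Map a run's state to a dashboard colour.

    Assumption: a run that is pending/running, or terminated with
    result_state SUCCESS, counts as green; everything else is red.
    """
    state = run.get("state", {})
    life_cycle = state.get("life_cycle_state")
    result = state.get("result_state")
    if life_cycle in ("PENDING", "RUNNING") or result == "SUCCESS":
        return "green"
    return "red"


# Example usage (requires a valid PAT):
#   for run in list_job_runs(DATABRICKS_HOST, "dapiXXXX..."):
#       print(run.get("run_name"), run_health(run))
```

You could run this on a timer (e.g. an Azure Function) and push the colours to whatever workbook or dashboard you use.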
You can also check directly whether your cluster is running in the Azure Databricks event log, like below:
You can also configure Databricks logs via log4j and send them to the Azure Monitor service for monitoring, like below:
You can send the above log4j logs to Azure Log Analytics too.
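Once the logs are in Log Analytics, a KQL query can drive a workbook tile. This is only a sketch: the table name `SparkLoggingEvent_CL` comes from the Azure Databricks monitoring library guidance, and the column names here are assumptions that depend on how your log forwarding is configured.

```kusto
// Count recent ERROR-level log events per 5-minute window.
// Table and column names are illustrative; check your own workspace schema.
SparkLoggingEvent_CL
| where TimeGenerated > ago(1h)
| where Level == "ERROR"
| summarize errorCount = count() by bin(TimeGenerated, 5m)
```

A workbook can then colour the tile red when `errorCount` exceeds a threshold.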
Additionally, you can use Ganglia and Datadog to monitor Azure Databricks:
References:
Send Databricks app logs to Azure Monitor - Azure Architecture Center | Microsoft Learn