Tags: azure databricks cluster-computing jobs monitor

Azure Databricks - monitor job cluster for up or down status


Does anyone have a way to monitor a group of job clusters in Azure Databricks?

We just want to make sure the job clusters are up and running, maybe with a Dashboard or Workbook in Azure that shows red or green depending on the status of each job cluster.

We have these NRT interfaces pulling data from a source application via these job clusters, and we would like to see when they are down. We already get an alert when the service goes down, but a panel where we can see these interfaces would be really useful. Perhaps something that makes use of an API call is needed, unless there is something out of the box like those Ganglia reports, but I haven't seen anything close to what I'm looking for.

Thanks in advance for any answer you may provide.


Solution

  • You can get the status of Azure Databricks jobs by calling the Jobs REST API, as described below.

    Create a personal access token (PAT) from User Settings in the Databricks workspace. Copy the token and store it securely; you will need it to authenticate future API calls.

    For testing, I created one Databricks cluster and a job that runs a notebook, then ran the job.

    I then called the Jobs API to get the job details:

    https://adb-xxxxxxxxxxxx8.18.azuredatabricks.net/api/2.1/jobs/list
    

    Set the authorization type to Bearer Token and supply the PAT generated above.
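The same authenticated call can be sketched in Python using only the standard library. This is a minimal illustration, not the answer's original method: the function names `build_jobs_list_request` and `fetch_jobs` are hypothetical, and the workspace URL and token are placeholders you would replace with your own values.

```python
import json
import urllib.request


def build_jobs_list_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for the Jobs API 2.1 list endpoint.

    The PAT is sent as a bearer token in the Authorization header.
    """
    url = workspace_url.rstrip("/") + "/api/2.1/jobs/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_jobs(workspace_url: str, token: str) -> dict:
    """Send the request and decode the JSON response (requires network access)."""
    request = build_jobs_list_request(workspace_url, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_jobs("https://<your-workspace>.azuredatabricks.net", "<your-PAT>")` would return the decoded JSON body; in practice the token should come from a secret store rather than being hard-coded.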

    The call returns the job details as JSON in the response.

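    You can call this API on a schedule and feed the results into your monitoring. Note that `jobs/list` returns job definitions; for the state of individual runs, the same API provides `/api/2.1/jobs/runs/list`.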
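For a red/green panel, the run states returned by the Jobs API endpoint `/api/2.1/jobs/runs/list` can be reduced to one colour per job. The sketch below assumes each run carries `job_id`, `start_time`, and `state.life_cycle_state` fields, as in the Jobs API 2.1 response. Treating `PENDING` and `RUNNING` as "green" is an assumption that suits long-running NRT jobs; the function name `latest_status_by_job` is hypothetical.

```python
from typing import Dict, List

# Assumption: for continuously running NRT jobs, only these states count as "up".
ACTIVE_STATES = {"PENDING", "RUNNING"}


def latest_status_by_job(runs: List[dict]) -> Dict[int, str]:
    """Map each job_id to 'green' or 'red' based on its most recent run."""
    latest: Dict[int, dict] = {}
    for run in runs:
        job_id = run["job_id"]
        previous = latest.get(job_id)
        # Keep the run with the greatest start_time (epoch milliseconds).
        if previous is None or run.get("start_time", 0) > previous.get("start_time", 0):
            latest[job_id] = run

    return {
        job_id: "green"
        if run.get("state", {}).get("life_cycle_state") in ACTIVE_STATES
        else "red"
        for job_id, run in latest.items()
    }
```

The input would be the `runs` array from the decoded response; the resulting dictionary can then be pushed to whatever dashboard or workbook you use.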

    You can also check directly whether a cluster is running by looking at the cluster's event log in the Azure Databricks UI.
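Cluster state can also be read programmatically rather than from the UI. The Clusters API endpoint `/api/2.0/clusters/list` returns a `clusters` array whose entries include `cluster_name`, `state`, and `cluster_source`. The sketch below filters for job clusters and assigns a colour; it assumes job clusters are reported with `cluster_source` set to `"JOB"`, and the choice of which states count as "green" is an assumption to adjust. The function name `job_cluster_panel` is hypothetical.

```python
from typing import List, Tuple

# Assumption: a cluster that is running or resizing is still serving work.
UP_STATES = {"RUNNING", "RESIZING"}


def job_cluster_panel(clusters: List[dict]) -> List[Tuple[str, str]]:
    """Return (cluster_name, colour) pairs for job clusters only."""
    panel = []
    for cluster in clusters:
        # Skip all-purpose clusters created from the UI or API.
        if cluster.get("cluster_source") != "JOB":
            continue
        colour = "green" if cluster.get("state") in UP_STATES else "red"
        panel.append((cluster.get("cluster_name", "<unnamed>"), colour))
    return panel
```

The input would be the `clusters` array from the decoded response, fetched with the same bearer-token authentication as the Jobs API call.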

    You can also configure Databricks log4j logging and send the logs to Azure Monitor for monitoring. The same log4j logs can be sent to Azure Log Analytics too.

    Additionally, you can use Ganglia and Datadog to monitor Azure Databricks.

    References:

    Send Databricks app logs to Azure Monitor - Azure Architecture Center | Microsoft Learn

    Manage clusters - Azure Databricks | Microsoft Learn

    Jobs API 2.1 | Databricks on AWS