
How to get cluster details such as the cluster ID from code running on an HDInsight cluster


I need a ClusterInner object from the Azure API, or at least some cluster information such as the cluster ID.

To get a ClusterInner object or the cluster ID, I have to pass an authentication object to the API. However, this code runs on the HDInsight cluster itself, so ideally it wouldn't ask for credentials, or it would pick them up from the environment. (My Spark job is already running on this cluster and needs this information.)

Is there an API or another way to get this information from within the running HDInsight cluster?


Solution

  • Editing my answer as per the comment

    This method is not clean, but it does get you the details. Note: it applies only to HDInsight clusters, and the deployment details can be read only from the head nodes. Look up the server.jdbc.database_name field in /etc/ambari-server/conf/ambari.properties. For example, if its value is v40e8b2c1e26279460ca3e8c0cbc75af8f8AmbariDb, trim the first 3 characters and the last 8 characters of the string. The remaining 32-character string is your cluster ID. You can use a shell script within your job to extract it from the file. The shell script is below:

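        #!/bin/bash
        # Read the Ambari database name, e.g. v40<32-character cluster ID>AmbariDb
        string=$(sed -n 's/^server\.jdbc\.database_name=//p' /etc/ambari-server/conf/ambari.properties)
        POS=3    # skip the 3-character prefix
        LEN=32   # length of the cluster ID
        clusterid=${string:$POS:$LEN}
        echo "$clusterid"    # print the ID so the caller can capture it

    You can embed the script in Python/Java. I am using Python to achieve this:

        import subprocess

        # Run the script with bash, not sh: the substring expansion is bash-only.
        cluster_id = subprocess.check_output(['bash', '/path/to/script.sh'], universal_newlines=True).strip()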
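
The same trimming can also be done in Python alone, without a separate shell script. This is only a minimal sketch. It assumes the database name always carries a 3-character prefix and the 8-character AmbariDb suffix, as in the example value in this answer. The helper names extract_cluster_id and read_cluster_id are hypothetical and not part of any Azure or Ambari API.

```python
from pathlib import Path

# Path and key are taken from the answer; the prefix/suffix lengths are an
# assumption based on the single example value shown there.
AMBARI_PROPERTIES = "/etc/ambari-server/conf/ambari.properties"
DB_NAME_KEY = "server.jdbc.database_name"
PREFIX_LEN = 3  # e.g. "v40"
SUFFIX_LEN = 8  # "AmbariDb"


def extract_cluster_id(db_name: str) -> str:
    """Strip the assumed prefix and suffix from the Ambari database name."""
    db_name = db_name.strip()
    if len(db_name) <= PREFIX_LEN + SUFFIX_LEN:
        raise ValueError("database name too short: %r" % db_name)
    return db_name[PREFIX_LEN:-SUFFIX_LEN]


def read_cluster_id(path: str = AMBARI_PROPERTIES) -> str:
    """Find the database name in a properties file and return the cluster ID."""
    for line in Path(path).read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == DB_NAME_KEY:
            return extract_cluster_id(value)
    raise KeyError("%s not found in %s" % (DB_NAME_KEY, path))


if __name__ == "__main__":
    # Only meaningful on a head node, where the properties file exists.
    print(read_cluster_id())
```

As with the shell script, this has to run on a head node, so a Spark job would need to execute it there and pass the result on to the executors.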