pyspark · hadoop-yarn

How to get YARN "Memory Total" and "VCores Total" metrics programmatically in pyspark


I've been digging through this: https://docs.actian.com/vectorhadoop/5.0/index.html#page/User/YARN_Configuration_Settings.htm

but none of those configs are what I need.

"yarn.nodemanager.resource.memory-mb" was promising, but it's only for the node manager it seems and only gets master's mem and cpu, not the cluster's.

int(hl.spark_context()._jsc.hadoopConfiguration().get('yarn.nodemanager.resource.memory-mb'))

Solution

  • You can get those metrics from the YARN ResourceManager REST API.
    URL: http://rm-http-address:port/ws/v1/cluster/metrics
    metrics:

    totalMB
    totalVirtualCores  
    

    Example response (it can also be returned as XML; see the note at the end of this answer):

    {  "clusterMetrics":   {
        "appsSubmitted":0,
        "appsCompleted":0,
        "appsPending":0,
        "appsRunning":0,
        "appsFailed":0,
        "appsKilled":0,
        "reservedMB":0,
        "availableMB":17408,
        "allocatedMB":0,
        "reservedVirtualCores":0,
        "availableVirtualCores":7,
        "allocatedVirtualCores":1,
        "containersAllocated":0,
        "containersReserved":0,
        "containersPending":0,
        "totalMB":17408,
        "totalVirtualCores":8,
        "totalNodes":1,
        "lostNodes":0,
        "unhealthyNodes":0,
        "decommissioningNodes":0,
        "decommissionedNodes":0,
        "rebootedNodes":0,
        "activeNodes":1,
        "shutdownNodes":0   } }
    

    https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Metrics_API

    All you need to do is figure out your YARN ResourceManager address and port. Check your configuration files; I can't help with that part, since I don't know how your YARN deployment is managed.
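
    If you already have a Spark session on the cluster, one place to look is the Hadoop configuration itself. This is only a sketch, and it assumes your cluster sets yarn.resourcemanager.webapp.address explicitly; many deployments only set yarn.resourcemanager.hostname and leave the web port at its default (8088):

    # Sketch: try to read the ResourceManager web address from the driver's Hadoop config.
    # Both properties are standard YARN settings, but either may be unset in your cluster.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()

    rm_webapp = hadoop_conf.get('yarn.resourcemanager.webapp.address')  # e.g. 'rm-host:8088', or None
    rm_host = hadoop_conf.get('yarn.resourcemanager.hostname')          # fallback: hostname only
    print(rm_webapp or rm_host)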

    Once you have the URL, query it from Python:

    import requests

    url = 'http://rm-http-address:port/ws/v1/cluster/metrics'
    response = requests.get(url)
    # Parse the JSON body and pull out the metrics you need
    metrics = response.json()['clusterMetrics']
    total_mb = metrics['totalMB']
    total_vcores = metrics['totalVirtualCores']
    

    Of course, no Hadoop or Spark context is needed for this solution.
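
    As a side note on the "can also be XML" remark above: the response format is negotiated via the HTTP Accept header. A minimal sketch, assuming the same placeholder URL and that the XML root element is clusterMetrics as shown in the linked docs:

    import requests
    import xml.etree.ElementTree as ET

    url = 'http://rm-http-address:port/ws/v1/cluster/metrics'
    response = requests.get(url, headers={'Accept': 'application/xml'})
    root = ET.fromstring(response.text)  # <clusterMetrics>...</clusterMetrics>
    total_mb = int(root.findtext('totalMB'))
    total_vcores = int(root.findtext('totalVirtualCores'))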