I've been looking through this page: https://docs.actian.com/vectorhadoop/5.0/index.html#page/User/YARN_Configuration_Settings.htm
but none of those configs are what I need.
"yarn.nodemanager.resource.memory-mb" looked promising, but it seems to apply only to the node manager, and it only gives me the master's memory and CPU, not the cluster's.
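You can get those metrics from the YARN ResourceManager REST API (the /ws/v1/cluster/metrics endpoint is served by the ResourceManager, not the History Server).
URL: http://rm-http-address:port/ws/v1/cluster/metrics
Example response (XML is also available):
{ "clusterMetrics": {
"shutdownNodes":0 } }
All you need is to figure out your ResourceManager web address and port. Check your configuration files: it is normally the yarn.resourcemanager.webapp.address property in yarn-site.xml, and the default port is 8088. I can't be more specific since I don't know how you manage YARN.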
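If you have a copy of yarn-site.xml at hand, a minimal sketch like the one below can pull the address out of it. The helper name read_hadoop_property and the sample host rm-host.example.com are made up for illustration, and the property may be absent from your file if the cluster relies on defaults or a management tool.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of a named <property> in Hadoop-style XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical yarn-site.xml content; in practice read the real file instead.
sample = """<configuration>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>rm-host.example.com:8088</value>
  </property>
</configuration>"""

address = read_hadoop_property(sample, "yarn.resourcemanager.webapp.address")
print(address)
```

With the real file you would pass open(path).read() instead of the sample string; the path depends on your installation.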
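When you have the URL, access it with Python:
import requests

url = 'http://rm-http-address:port/ws/v1/cluster/metrics'
# Ask for JSON explicitly, since the endpoint can also return XML.
response = requests.get(url, headers={'Accept': 'application/json'})
response.raise_for_status()
# The metrics are nested under the "clusterMetrics" key.
metrics = response.json()['clusterMetrics']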
Note that this solution needs no Hadoop or Spark context.
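To get the cluster-wide memory and CPU the question asks about, something like the following sketch may help. It uses only the standard library (urllib instead of requests) so it runs without extra packages. The function names are made up for illustration, and it assumes your Hadoop version reports the totalMB and totalVirtualCores fields in clusterMetrics; missing fields come back as None. The sample payload values are invented.

```python
import json
from urllib.request import Request, urlopen


def fetch_cluster_metrics(base_url, timeout=10):
    """GET <base_url>/ws/v1/cluster/metrics and return the decoded JSON."""
    request = Request(
        f"{base_url}/ws/v1/cluster/metrics",
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def cluster_totals(payload):
    """Pick total memory (MB) and vcores out of a clusterMetrics payload."""
    metrics = payload["clusterMetrics"]
    return {
        "memory_mb": metrics.get("totalMB"),
        "vcores": metrics.get("totalVirtualCores"),
    }


# Invented sample payload, shaped like the response shown above.
sample_payload = {
    "clusterMetrics": {"totalMB": 16384, "totalVirtualCores": 8, "shutdownNodes": 0}
}
print(cluster_totals(sample_payload))
```

Against a live cluster you would call cluster_totals(fetch_cluster_metrics('http://rm-http-address:port')) with your own ResourceManager address.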