Here are my cluster details:
Master : Running 1 m4.xlarge
Core : Running 3 m4.xlarge
Task : --
Cluster scaling: Not enabled
I am using notebooks to practice PySpark, and I would like to know how the resources are being utilised, to assess whether they are under-utilised or not enough for my tasks. As part of that, when checking RAM/memory usage, here's what I got from the terminal:
notebook@ip-xxx-xxx-xxx-xxx ~$ free -h
total used free shared buff/cache available
Mem: 1.9G 456M 759M 72K 741M 1.4G
Swap: 0B 0B 0B
Each m4.xlarge instance comes with 16 GB of memory, so what's happening, and why is only about two gigs of the 16 GB being shown? And how do I properly learn how much of my CPU, memory and storage are actually being used? (yes, to reduce costs!!)
First, the free -h output above is most likely not from one of your m4.xlarge nodes: the notebook terminal runs in its own small environment, separate from the cluster, which is why you only see about 2 GB. For the actual cluster nodes, if you want to check memory and CPU utilization, you can do that in CloudWatch using the instance ID.
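For example, here is a minimal sketch with boto3 (the region and instance id are placeholders you would replace with your own; also note that EC2 publishes CPU metrics to CloudWatch by default, but not memory, which requires the CloudWatch agent on the node):

from datetime import datetime, timedelta, timezone
import boto3

# Assumption: region and instance id below are placeholders -- use your own.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder id
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=300,                    # 5-minute averages
    Statistics=["Average"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1), "%")

EMR also publishes cluster-level metrics (for example YARNMemoryAvailablePercentage) under the AWS/ElasticMapReduce namespace, which you can query the same way using the JobFlowId dimension instead of the instance ID.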
Another option is the resource manager UI (YARN); the default URL is http://master-node-ip:8088.
There you can get metrics at the job level as well as the node level.
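The ResourceManager exposes the same information as a REST API, which is handy for a quick check from a notebook cell. A minimal sketch (the host is a placeholder for your master node's address; port 8088 is the YARN default mentioned above):

import requests

RM = "http://master-node-ip:8088"  # placeholder: your master node's address

# Cluster-wide totals: allocated vs. total memory and vCores across all nodes.
m = requests.get(RM + "/ws/v1/cluster/metrics").json()["clusterMetrics"]
print("memory:", m["allocatedMB"], "/", m["totalMB"], "MB in use")
print("vcores:", m["allocatedVirtualCores"], "/", m["totalVirtualCores"], "in use")

# Per-node breakdown, one entry per core/task node.
nodes = requests.get(RM + "/ws/v1/cluster/nodes").json()["nodes"]["node"]
for n in nodes:
    total_mb = n["usedMemoryMB"] + n["availMemoryMB"]
    print(n["nodeHostName"], n["usedMemoryMB"], "/", total_mb, "MB used")

If allocatedMB stays well below totalMB while your jobs run, the cluster is over-provisioned and you can drop a core node or move to smaller instances to cut costs.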