I would like to collect aggregate usage metrics from a Cloudera 5.4.4 Hadoop cluster. Some of the metrics in my mind are as below:
Are there any APIs/resources/tools etc that I could use for starting with this? I don't think I am entirely sure of where to begin from. Any starting point would be greatly appreciated. Also, please do share your experience with cluster usage metrics, if you have had any.
Thanks in advance!
Ganglia is an open-source, scalable and distributed monitoring system for large clusters. It collects, aggregates and provides time-series views of tens of machine-related metrics such as CPU, memory, storage, network usage
. You can see Ganglia in action at UC Berkeley Grid.
Ganglia is also a popular solution for monitoring Hadoop and HBase clusters, since Hadoop (and HBase) has built-in support for publishing its metrics to Ganglia. With Ganglia you may easily see the number of bytes written by a particular HDSF datanode over time, the block cache hit ratio for a given HBase region server, the total number of requests to the HBase cluster, time spent in garbage collection and many, many others.
ref- http://hakunamapdata.com/ganglia-configuration-for-a-small-hadoop-cluster-and-some-troubleshooting/