
Monitoring a Hadoop multi-node cluster with Ganglia


I want to monitor a Hadoop (version 0.20.2) multi-node cluster using Ganglia. My Hadoop cluster is working properly. I installed Ganglia after reading the following blog posts:

http://hakunamapdata.com/ganglia-configuration-for-a-small-hadoop-cluster-and-some-troubleshooting/

http://hokamblogs.blogspot.in/2013/06/ganglia-overview-and-installation-on.html

I have also studied Monitoring with Ganglia.pdf (Appendix B, "Ganglia and Hadoop/HBase").

I have modified only the following lines in **hadoop-metrics.properties** (identical on all Hadoop nodes):



# Configuration of the "dfs" context for ganglia
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=192.168.1.182:8649

# Configuration of the "mapred" context for ganglia
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.period=10
mapred.servers=192.168.1.182:8649

# Configuration of the "jvm" context for ganglia
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=192.168.1.182:8649
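
A malformed `*.servers` value (for example one with an extra `:port` field) is easy to miss by eye. The following is a minimal sketch of a checker, not part of Hadoop: `check_servers` is a hypothetical helper, and it assumes plain `key=value` lines in which each server is written as an explicit `host:port`, separated by commas or spaces. The sample text is illustrative only.

```python
import re

# host:port, where host is a hostname or IPv4 address and port is 1-5 digits
_HOST_PORT = re.compile(r"^[A-Za-z0-9.\-]+:\d{1,5}$")


def check_servers(properties_text):
    """Return (key, value) pairs whose *.servers value is not a list of host:port."""
    bad = []
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' and '!' start a comment line)
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key.endswith(".servers"):
            continue
        # Assumption: servers are separated by commas and/or whitespace
        servers = re.split(r"[,\s]+", value)
        if not all(_HOST_PORT.match(s) for s in servers):
            bad.append((key, value))
    return bad


# Illustrative input: the second entry carries a duplicated port
sample = """
dfs.servers=192.168.1.182:8649
mapred.servers=192.168.1.182:8649:8649
jvm.servers=192.168.1.182:8649
"""
print(check_servers(sample))
# [('mapred.servers', '192.168.1.182:8649:8649')]
```

An empty list means every `*.servers` entry at least has the expected shape; it does not prove that the address is reachable.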


**gmetad.conf** (only on the Hadoop master node)


data_source "Hadoop-slaves" 5 192.168.1.182:8649
# Keep one week of data for analysis.
RRAs "RRA:AVERAGE:0.5:1:302400"
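
How much history an RRA actually holds is rows × steps per row × the base step of the RRD file. The sketch below does that arithmetic for the definition used here. The base step is an assumption: it depends on how gmetad creates its RRD files, so it is passed in as a parameter, and the two values tried are only examples. `rra_retention_seconds` is a hypothetical helper name.

```python
def rra_retention_seconds(rra, step_seconds):
    """Retention, in seconds, of one 'RRA:CF:xff:steps:rows' definition."""
    tag, _cf, _xff, steps, rows = rra.split(":")
    if tag != "RRA":
        raise ValueError("not an RRA definition: %r" % rra)
    # Each row consolidates `steps` base intervals of `step_seconds` each
    return int(steps) * int(rows) * step_seconds


rra = "RRA:AVERAGE:0.5:1:302400"
for step in (2, 15):  # example base steps, in seconds (assumed, not measured)
    days = rra_retention_seconds(rra, step) / 86400
    print("step=%ds -> %.1f days" % (step, days))
# step=2s -> 7.0 days
# step=15s -> 52.5 days
```

So 302400 rows correspond to exactly one week only if the base step is 2 seconds; with a different step the same definition keeps a different amount of history.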



**gmond.conf** (on all Hadoop slave nodes and the Hadoop master)

globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  allow_extra_data = yes
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  send_metadata_interval = 0
}

cluster {
  name = "Hadoop-slaves"
  owner = "Sandeep Priyank"
  latlong = "unspecified"
  url = "unspecified"
}

/* The host section describes attributes of the host, like the location */
host {
  location = "CASL"
}

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  host = 192.168.1.182
  port = 8649
  ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  port = 8649
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
}

Ganglia now shows only the system metrics (memory, disk, etc.) for all the nodes. It does not show the Hadoop metrics (such as the jvm and mapred metrics) on the web interface. How can I fix this problem?
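
One way to narrow this down is to check whether the Hadoop metrics reach gmond at all, since the `tcp_accept_channel` serves an XML description of the cluster state to any client that connects. The sketch below is a rough diagnostic under stated assumptions: the dump contains `METRIC` elements with a `NAME` attribute, and Hadoop metric names begin with the context name (`dfs.`, `mapred.`, `jvm.`). The function names and the sample XML are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET

# Assumption: Hadoop metric names are prefixed with their context name
HADOOP_PREFIXES = ("dfs.", "mapred.", "jvm.")


def hadoop_metric_names(xml_bytes, prefixes=HADOOP_PREFIXES):
    """Return sorted, unique METRIC names in the dump that start with a prefix."""
    root = ET.fromstring(xml_bytes)
    names = {m.get("NAME", "") for m in root.iter("METRIC")}
    return sorted(n for n in names if n.startswith(prefixes))


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Connect to gmond's TCP channel and read until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Made-up sample dump: one system metric and one Hadoop-style metric
sample = b"""<GANGLIA_XML>
<CLUSTER NAME="Hadoop-slaves">
<HOST NAME="node1">
<METRIC NAME="mem_free" VAL="1024"/>
<METRIC NAME="jvm.metrics.memHeapUsedM" VAL="42.0"/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""
print(hadoop_metric_names(sample))
# ['jvm.metrics.memHeapUsedM']
```

Against a live node, `hadoop_metric_names(fetch_gmond_xml("192.168.1.182"))` would list whatever Hadoop-prefixed metrics gmond currently knows about. An empty list would suggest the metrics never arrive at gmond, which points at the Hadoop side rather than at gmetad or the web frontend.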


Solution

  • Thanks to everyone. If you are using an older version of Hadoop, copy the following files from a newer version of Hadoop:

    1. GangliaContext31.java

    2. GangliaContext.java

    Put them in hadoop/src/core/org/apache/hadoop/metrics/ganglia.

    Then compile Hadoop with ant (set the proxy properly if the build needs one). If the build fails because a function definition is missing, copy that definition from the newer version into the appropriate Java file and compile Hadoop again. It should then work.