hadoop hortonworks-data-platform ganglia

Ambari dashboard retrieving no statistics


I have a fresh install of Hortonworks Data Platform 2.2 on a small cluster (4 machines), but when I log in to the Ambari GUI, most of the dashboard stats boxes (HDFS disk usage, Network usage, Memory usage, etc.) are not populated with any statistics. Instead they show the message:

No data There was no data available.  Possible reasons include inaccessible Ganglia service

Clicking on the HDFS service link gives the following summary:

NameNode    Started
SNameNode   Started
DataNodes   4/4 DataNodes Live
NameNode Uptime     Not Running
NameNode Heap   n/a / n/a (0.0% used)
DataNodes Status    4 live / 0 dead / 0 decommissioning
Disk Usage (DFS Used)   n/a / n/a (0%)
Disk Usage (Non DFS Used)   n/a / n/a (0%)
Disk Usage (Remaining)  n/a / n/a (0%)
Blocks (total)  n/a
Block Errors    n/a corrupt / n/a missing / n/a under replicated
Total Files + Directories   n/a
Upgrade Status  Upgrade not finalized
Safe Mode Status    n/a

The Alerts and Health Checks box on the right of the screen is not displaying any information, but if I click on the settings icon, the Nagios frontend opens and, again, everything looks healthy there!

The install went smoothly (CentOS 6.5) and everything looks good as far as the services are concerned (all started, with a green tick next to each service name). Some stats are displayed on the dashboard: 4/4 DataNodes are live, 1/1 NodeManagers are live and 1/1 Supervisors are live. I can write files to HDFS, so it looks like it's a Ganglia issue?

The Ganglia daemon seems to be working ok:

ps -ef | grep gmond
nobody    1720     1  0 12:54 ?        00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPHistoryServer/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPHistoryServer/gmond.pid
nobody    1753     1  0 12:54 ?        00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPFlumeServer/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPFlumeServer/gmond.pid
nobody    1790     1  0 12:54 ?        00:00:48 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPHBaseMaster/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPHBaseMaster/gmond.pid
nobody    1821     1  1 12:54 ?        00:00:57 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPKafka/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPKafka/gmond.pid
nobody    1850     1  0 12:54 ?        00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPSupervisor/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPSupervisor/gmond.pid
nobody    1879     1  0 12:54 ?        00:00:45 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPSlaves/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPSlaves/gmond.pid
nobody    1909     1  0 12:54 ?        00:00:48 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPResourceManager/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPResourceManager/gmond.pid
nobody    1938     1  0 12:54 ?        00:00:50 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPNameNode/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPNameNode/gmond.pid
nobody    1967     1  0 12:54 ?        00:00:47 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPNodeManager/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPNodeManager/gmond.pid
nobody    1996     1  0 12:54 ?        00:00:44 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPNimbus/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPNimbus/gmond.pid
nobody    2028     1  1 12:54 ?        00:00:58 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPDataNode/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPDataNode/gmond.pid
nobody    2057     1  0 12:54 ?        00:00:51 /usr/sbin/gmond --conf=/etc/ganglia/hdp/HDPHBaseRegionServer/gmond.core.conf --pid-file=/var/run/ganglia/hdp/HDPHBaseRegionServer/gmond.pid

I have checked the Ganglia service on each node, and the processes are running as expected:

ps -ef | grep gmetad
nobody    2807     1  2 12:55 ?        00:01:59 /usr/sbin/gmetad --conf=/etc/ganglia/hdp/gmetad.conf --pid-file=/var/run/ganglia/hdp/gmetad.pid

I have tried restarting the Ganglia services with no luck, and restarting all services made no difference either. Does anyone have any ideas how I can get the dashboard to work properly? Thank you.


Solution

  • It turned out to be a proxy issue. To access the internet I had to add my proxy details to the file /var/lib/ambari-server/ambari-env.sh:

    export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms512m -Xmx2048m -Dhttp.proxyHost=theproxy -Dhttp.proxyPort=80 -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false'
    

    When Ganglia was trying to access each node in the cluster, the requests were going via the proxy and never resolving. To overcome the issue I added my nodes to the exclude list (the -Dhttp.nonProxyHosts flag), like so:

    export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms512m -Xmx2048m -Dhttp.proxyHost=theproxy -Dhttp.proxyPort=80 -Dhttp.nonProxyHosts="localhost|node1.dms|node2.dms|node3.dms|etc" -Djava.security.auth.login.config=/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false'
    

    After adding the exclude list the stats were retrieved as expected!
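
If the cluster has more than a handful of nodes, typing the exclude list by hand gets error-prone. Below is a minimal sketch of how the pipe-separated value for -Dhttp.nonProxyHosts could be assembled from a list of host names. The function name build_non_proxy_hosts and the node names are hypothetical, chosen only for illustration; substitute the real FQDNs of your own nodes.

```shell
#!/bin/sh
# Join the given host names with '|' to form an http.nonProxyHosts value.
# build_non_proxy_hosts is a hypothetical helper, not part of Ambari.
build_non_proxy_hosts() {
    result=""
    for host in "$@"; do
        if [ -z "$result" ]; then
            result="$host"            # first entry: no separator
        else
            result="$result|$host"    # later entries: prefix with '|'
        fi
    done
    printf '%s\n' "$result"
}

# Hypothetical host names; replace with the cluster's real FQDNs.
NON_PROXY_HOSTS=$(build_non_proxy_hosts localhost node1.dms node2.dms node3.dms)

# Print the flag in the form used inside AMBARI_JVM_ARGS.
printf '%s\n' "-Dhttp.nonProxyHosts=\"$NON_PROXY_HOSTS\""
```

The printed flag can then be pasted into the AMBARI_JVM_ARGS line in ambari-env.sh. I would expect the Ambari server to need a restart before the new JVM arguments take effect.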