Tags: hadoop, mapreduce, hdfs, hadoop-yarn, openstack

Hadoop: how to correctly decommission nodes on a cluster?


I've been trying to change the number of nodes in my Hadoop cluster (5 nodes in total: 1 master and 4 workers) by following this solution, "change number of data nodes in Hadoop", and this useful post, "Commissioning and Decommissioning of Datanode in Hadoop Cluster".

Now I can see that on HDFS I have successfully decommissioned one node (see HDFS screenshot).

I have set up an exclude-file property in my hdfs-site.xml (hdfs-site.xml screenshot) as well as in yarn-site.xml (yarn-site.xml screenshot), pointing to a file that contains the IP address of the node I want to decommission (excludes file).
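For reference, a minimal sketch of the relevant configuration, assuming the excludes file lives at /home/hadoop/excludes (the path is a placeholder, not taken from my screenshots):

    <!-- hdfs-site.xml: excludes file read by the NameNode -->
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/home/hadoop/excludes</value>
    </property>

    <!-- yarn-site.xml: excludes file read by the ResourceManager -->
    <property>
      <name>yarn.resourcemanager.nodes.exclude-path</name>
      <value>/home/hadoop/excludes</value>
    </property>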

I have also run the refresh-nodes commands.
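Concretely, that means refreshing both the NameNode and the ResourceManager, since HDFS and YARN each read the excludes file independently:

    hdfs dfsadmin -refreshNodes
    yarn rmadmin -refreshNodes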

Finally, I ran hadoop dfsadmin -report and I can see that the node is actually decommissioned (nodes report).
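In the per-datanode section of that report, the status line for the removed node should read roughly like this (the hostname and IP are placeholders):

    Name: 192.168.1.14:50010 (worker4)
    Decommission Status : Decommissioned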

However, in the MapReduce cluster metrics I found that there are still 5 active nodes, and the node that has been decommissioned on HDFS is not identified among the decommissioned nodes there.

see: hadoop cluster metrics

Why is that?


Solution

  • The issue was solved when I changed the hostname in the exclude file: the node name must not contain a port number. An illustration follows below.
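In other words, YARN does not match an entry of the form hostname:port against the NodeManager it knows, so the node never shows up as decommissioned on the ResourceManager side. A minimal illustration of the excludes file (the hostname and port are placeholders):

Before (ignored by YARN, node stays active):

    worker4.example.com:8042

After (matches the NodeManager, node is decommissioned after yarn rmadmin -refreshNodes):

    worker4.example.com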