
How to add a Secondary NameNode in an HBase cluster setup?


I have an HBase cluster setup with 3 nodes: a NameNode and 2 DataNodes. The NameNode is a server with 4GB of memory and a 20GB hard disk, while each DataNode has 8GB of memory and a 100GB hard disk.

I'm using Apache Hadoop 2.7.2 and Apache HBase 1.2.4.

I've seen some people mention a Secondary NameNode.

My questions are:

  1. What is the impact of not having a Secondary NameNode in my setup?
  2. Is it possible to use one of the DataNodes as the Secondary NameNode?
  3. If possible how can I do it? (I inserted only the NameNode in /etc/hadoop/masters file.)

Solution

    1. What is the impact of not having a Secondary NameNode in my setup?

    The Secondary NameNode periodically merges the namespace image (fsimage) with the edit log, a process called checkpointing. Your setup is not a High-Availability setup, so without a Secondary NameNode nothing performs that merge while the NameNode is running: the edit log keeps growing, and the NameNode has to replay all of it at startup, which makes restarts increasingly slow.
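
    How often a checkpoint happens is configurable in hdfs-site.xml. A rough sketch, assuming the Hadoop 2.x property names; the values shown are what I believe to be the shipped defaults, so check hdfs-default.xml for your version before relying on them:

    ```xml
    <!-- Assumed Hadoop 2.x property names; values are illustrative defaults. -->
    <!-- Maximum time between two checkpoints, in seconds. -->
    <property>
       <name>dfs.namenode.checkpoint.period</name>
       <value>3600</value>
    </property>
    <!-- Number of uncheckpointed transactions that forces a checkpoint. -->
    <property>
       <name>dfs.namenode.checkpoint.txns</name>
       <value>1000000</value>
    </property>
    ```

    A checkpoint is triggered by whichever of the two limits is reached first.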

    2. Is it possible to use one of the DataNodes as the Secondary NameNode?

    Running the Secondary NameNode (SNN) on a DataNode host is not recommended; a separate host is preferred. The host chosen for the SNN should have as much memory as the NameNode (NN), because the SNN loads the same namespace image into memory to perform the merge.

    3. If possible how can I do it? (I inserted only the NameNode in /etc/hadoop/masters file.)

    The masters file is no longer used. Instead, add this property to hdfs-site.xml, replacing SNN_host with the hostname of the machine that should run the SNN:

    <property>
       <name>dfs.namenode.secondary.http-address</name>
       <value>SNN_host:50090</value>
    </property>
    

    Also note that, by default, the Secondary NameNode process is started on the node where start-dfs.sh is executed.