I have a non-HA Hadoop setup with 3 nodes: one NameNode and 2 DataNodes. The NameNode is a server with 4 GB of memory and a 20 GB hard disk, while each DataNode has 8 GB of memory and a 100 GB hard disk.
Now I need to convert this to an HA cluster. I've read about two ways of doing this: using the Quorum Journal Manager (QJM) or using shared storage.
Which of these two approaches is better?
How many additional nodes do I need for each approach?
How can I minimize the number of new nodes by reusing the existing ones (is it recommended to run JournalNodes on the DataNode and NameNode machines)?
I'm using Apache Hadoop 2.7.2 and Apache HBase 1.2.4.
Which of these two approaches is better?
QJM (Quorum Journal Manager) is the recommended choice unless you already have highly reliable, fault-tolerant shared storage.
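How many additional nodes do I need for each approach?
One, for the standby NameNode. You can run the JournalNodes and ZooKeeper nodes on the same machines as the DataNodes. Note that QJM needs at least three JournalNodes (an odd number, so that a majority can be formed), so with only two DataNodes the third JournalNode has to run on another host, such as one of the NameNode machines.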
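
As a rough sketch of what the QJM setup could look like on your four machines (the existing three plus one standby NameNode), here are the main HA properties. The hostnames, the nameservice ID "mycluster", the NameNode IDs nn1/nn2, and every path below are placeholders I made up for illustration; the ports are the usual defaults. Treat it as a starting point to check against the HDFS HA documentation for 2.7.2, not as a complete configuration.

```xml
<!-- hdfs-site.xml (fragment). All hostnames, IDs and paths are hypothetical. -->
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>namenode1.example.com:8020</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>namenode2.example.com:8020</value></property>

<!-- Three JournalNodes: both DataNode hosts plus the standby NameNode host. -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://datanode1.example.com:8485;datanode2.example.com:8485;namenode2.example.com:8485/mycluster</value>
</property>
<!-- Local directory where each JournalNode stores its edits (placeholder path). -->
<property><name>dfs.journalnode.edits.dir</name><value>/data/hadoop/journal</value></property>

<!-- How clients find the currently active NameNode. -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- A fencing method must be configured even with QJM; the key path is a placeholder. -->
<property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
<property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/hadoop/.ssh/id_rsa</value></property>

<!-- Automatic failover relies on the ZooKeeper quorum set in core-site.xml. -->
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>

<!-- core-site.xml (fragment). -->
<property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>datanode1.example.com:2181,datanode2.example.com:2181,namenode2.example.com:2181</value>
</property>
```

The placement here (JournalNodes and ZooKeeper on the two DataNodes and the new standby NameNode) is just one way to reach three of each without adding more than one machine; any layout that keeps an odd number of each on separate hosts should work the same way.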