I have set up a Hadoop cluster of 5 virtual machines, using plain vanilla Hadoop. The cluster details are below:
192.168.1.100 - Configured to run the NameNode and Secondary NameNode (SNN) daemons.
192.168.1.101 - Configured to run the ResourceManager daemon.
192.168.1.102 - Configured to run the DataNode and NodeManager daemons.
192.168.1.103 - Configured to run the DataNode and NodeManager daemons.
192.168.1.104 - Configured to run the DataNode and NodeManager daemons.
I have kept the masters and slaves files on each virtual server.
192.168.1.100
192.168.1.101
192.168.1.102
192.168.1.103
192.168.1.104
Now, when I run the start-all.sh command from the NameNode machine, how is it able to start all the daemons? I do not understand this. There are no adapters installed (none that I am aware of); only the plain Hadoop jars are present on every machine. So how is the NameNode machine able to start the daemons on all the machines (virtual servers)?
Can anyone help me understand this?
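The start scripts on the NameNode machine connect to the slaves via SSH and start the slave services there. Nothing beyond the Hadoop installation and an SSH server has to be present on the slaves.
That is why the public SSH key must be in ~/.ssh/authorized_keys on each slave, with the matching private key available to the user who runs the Hadoop scripts on the NameNode machine.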
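To make the mechanism concrete, here is a minimal sketch of the idea: read the slaves file and run a command on each listed host over SSH. This is not Hadoop's actual script code. The function name `run_on_slaves`, the `DRY_RUN` switch, and the default path under `/opt/hadoop` are assumptions made up for illustration, so adjust them to your installation.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real Hadoop helper scripts differ in detail.

# Hypothetical default location of the slaves file (one host per line).
SLAVES_FILE="${SLAVES_FILE:-${HADOOP_HOME:-/opt/hadoop}/etc/hadoop/slaves}"
# Hypothetical switch: 1 = only print the ssh commands, 0 = really run them.
DRY_RUN="${DRY_RUN:-1}"

run_on_slaves() {
  local remote_cmd="$1"
  local host
  # Read the slaves file line by line, even if the last line has no newline.
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and comment lines.
    case "$host" in ''|'#'*) continue ;; esac
    if [ "$DRY_RUN" = "1" ]; then
      echo "ssh $host $remote_cmd"
    else
      # -n keeps ssh from swallowing the rest of the slaves file on stdin.
      # Passwordless login works only if this user's public key is listed
      # in ~/.ssh/authorized_keys on the remote host.
      ssh -n "$host" "$remote_cmd" &
    fi
  done < "$SLAVES_FILE"
  # Wait for all background ssh sessions to finish.
  wait
}
```

With `DRY_RUN=1`, calling `run_on_slaves "hostname"` only prints one `ssh` command per slave, which is a safe way to see what would be executed. With `DRY_RUN=0` it needs the key setup described above, otherwise each connection will prompt for a password.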