I'm having trouble setting up Hadoop. My setup consists of a nameNode VM and two separate physical dataNodes, all connected to the same network.
IP configuration:
I keep getting an error saying that there are 0 datanodes running, but when I run jps on the dataNode-1 or dataNode-2 machine, the DataNode process shows up as running. My nameNode log shows this:
File /user/hadoop/.bashrc_COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
The logs on my dataNode-1 machine tell me that it has trouble connecting to the nameNode.
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: namenode-1/192.168.118.212:9000
The odd part is that the DataNode process starts but can't connect to the nameNode. I can also SSH between all of the machines with no problems.
So my best guess is that I've configured one of the config files incorrectly, though I checked other questions on here and my files seem to be correct.
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://namenode-1:9000/</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/hadoop_data/hdfs/datanode</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/hadoop_data/hdfs/namenode</value>
<final>true</final>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.job.tracker</name>
<value>namenode-1:9001</value>
</property>
</configuration>
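The problem was the firewall: it was blocking the dataNodes' connections to the nameNode, even though SSH worked.
You can stop it by running systemctl stop firewalld.service (as root or with sudo). Note that this only stops the firewall until the next reboot if the service is still enabled.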
I found the answer here: https://stackoverflow.com/a/37994066/8789361
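If you want to confirm that a firewall is the cause before turning it off, you can check from a dataNode whether the nameNode's port accepts TCP connections at all. The following is only a minimal sketch: the helper name can_connect is hypothetical, and the host and port (namenode-1, 9000) are simply the values from the core-site.xml in the question. SSH working only shows that port 22 is open, so comparing the two ports can be informative.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability,
    not whether a Hadoop service is actually answering on that port.
    """
    try:
        # create_connection resolves the hostname and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example calls (run from a dataNode; host and ports come from the question):
#   can_connect("namenode-1", 22)    # SSH
#   can_connect("namenode-1", 9000)  # nameNode RPC port from core-site.xml
```

If port 22 is reachable but port 9000 is not, a firewall on the nameNode is a likely suspect, although a nameNode bound only to localhost would look the same from the outside.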