Tags: kubernetes, hadoop, hdfs, apache-zookeeper

java.lang.IllegalArgumentException: Does not contain a valid host:port authority: http at org.apache.hadoop.net.NetUtils.createSocketAddr


Note that I have deployed StatefulSets of 2 NameNodes, 2 DataNodes, and 3 JournalNodes for Apache Hadoop 3.3.3 HA on Kubernetes, but the NameNode is throwing the following error.

 $ hdfs --config /opt/hadoop/etc/hadoop namenode

{"name":"org.apache.hadoop.hdfs.server.namenode.NameNode","time":1659593176018,"date":"2022-08-04 06:06:16,018","level":"ERROR","thread":"Listener at 0.0.0.0/8020","message":"Error encountered requiring NN shutdown. Shutting down immediately.","exceptionclass":"java.lang.IllegalArgumentException","stack":["java.lang.IllegalArgumentException: **Does not contain a valid host:port authority: http:**","\tat org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:232)","\tat org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:189)","\tat org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:169)","\tat org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:158)","\tat org.apache.hadoop.hdfs.DFSUtil.substituteForWildcardAddress(DFSUtil.java:1046)","\tat org.apache.hadoop.hdfs.DFSUtil.getInfoServerWithDefaultHost(DFSUtil.java:1014)","\tat org.apache.hadoop.hdfs.server.namenode.ha.RemoteNameNodeInfo.getRemoteNameNodes(RemoteNameNodeInfo.java:61)","\tat org.apache.hadoop.hdfs.server.namenode.ha.RemoteNameNodeInfo.getRemoteNameNodes(RemoteNameNodeInfo.java:42)","\tat org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.<init>(EditLogTailer.java:191)","\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startStandbyServices(FSNamesystem.java:1501)","\tat org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startStandbyServices(NameNode.java:2051)","\tat org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.enterState(StandbyState.java:69)","\tat org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1024)","\tat org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)","\tat org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)","\tat org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)"]} core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://apache-hadoop-namenode:8020</value>
</property>
<property>
    <name>ha.zookeeper.quorum</name>
    <value>zk-headless.backend.svc.cluster.local:2181</value>
</property>
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/dfs/journal</value>
</property>

hdfs-site.xml

<property>
    <name>dfs.nameservices</name>
    <value>apache-hadoop-namenode</value>
</property>

<property>
    <name>dfs.ha.namenodes.apache-hadoop-namenode</name>
    <value>apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local,apache-hadoop-namenode-1.apache-hadoop-namenode.backend.svc.cluster.local</value>
</property>

<property>
    <name>dfs.namenode.rpc-address.apache-hadoop-namenode.apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local</name>
    <value>hdfs://apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local:8020</value>
</property>

<property>
    <name>dfs.namenode.rpc-address.apache-hadoop-namenode.apache-hadoop-namenode-1.apache-hadoop-namenode.backend.svc.cluster.local</name>
    <value>hdfs://apache-hadoop-namenode-1.apache-hadoop-namenode.backend.svc.cluster.local:8020</value>
</property>

<property>
    <name>dfs.namenode.http-address.apache-hadoop-namenode.apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local</name>
    <value>http://apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local:9870</value>
</property>
<property>
    <name>dfs.namenode.http-address.apache-hadoop-namenode.apache-hadoop-namenode-1.apache-hadoop-namenode.backend.svc.cluster.local</name>
    <value>http://apache-hadoop-namenode-1.apache-hadoop-namenode.backend.svc.cluster.local:9870</value>
</property>

<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://apache-hadoop-journalnode.backend.svc.cluster.local:8485/apache-hadoop-namenode</value>
</property>  
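
For reference, the exact string the standby NameNode tries to parse can be echoed back with hdfs getconf (a quick sanity check, using the config path and NameNode ID from above; getconf -confKey simply prints the resolved value):

    $ hdfs --config /opt/hadoop/etc/hadoop getconf -confKey \
        dfs.namenode.http-address.apache-hadoop-namenode.apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local
    http://apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local:9870

A bare host:port authority is expected for this value, which is why the exception message above quotes only the scheme fragment "http:".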

Solution

  • The solution is to remove the http:// scheme from the following properties in hdfs-site.xml:

    <property>
      <name>dfs.namenode.http-address.apache-hadoop-namenode.apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local</name>
      <value>apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local:9870</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.apache-hadoop-namenode.apache-hadoop-namenode-1.apache-hadoop-namenode.backend.svc.cluster.local</name>
      <value>apache-hadoop-namenode-1.apache-hadoop-namenode.backend.svc.cluster.local:9870</value>
    </property>
    

    This http-address property is required, as mentioned in https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#:~:text=dfs.namenode.http%2Daddress.%5Bnameservice%20ID%5D.%5Bname%20node%20ID%5D%20%2D%20the%20fully%2Dqualified%20HTTP%20address%20for%20each%20NameNode%20to%20listen%20on,
    but in my case the NameNode only started once the http:// prefix was removed from the value. Note that the dfs.namenode.rpc-address values above embed an hdfs:// scheme in the same way; those should likewise be plain host:port pairs.
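
    After editing hdfs-site.xml (typically via the mounted ConfigMap) and restarting the NameNode pods, the standby should start instead of shutting down. A minimal verification sketch, assuming the StatefulSet is named apache-hadoop-namenode in the backend namespace, as the service names in this question suggest:

        # Restart the NameNode pods so they pick up the corrected hdfs-site.xml
        # (the StatefulSet name is assumed from the service names in this question)
        kubectl -n backend rollout restart statefulset apache-hadoop-namenode

        # Each NameNode ID from dfs.ha.namenodes should now report active or standby
        hdfs haadmin -getServiceState apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local

        # The web UI answers on the plain host:port from dfs.namenode.http-address
        curl -s http://apache-hadoop-namenode-0.apache-hadoop-namenode.backend.svc.cluster.local:9870/ | head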