Search code examples
hadoopdistributedkerberosdistributed-computing

Server returns 403 during secondary namenode docheckpoint with namenode


I am configuring hadoop on clusters.

All node started successfully, but secondary node failed doCheckpoint with following log:

2011-10-25 11:09:07,207 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint: 
2011-10-25 11:09:07,208 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.io.IOException: Server returned HTTP response code: 403 for URL: https://name.node.http:50470/getimage?getimage=1
    at sun.reflect.GeneratedConstructorAccessor24.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:234)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:183)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$3.run(SecondaryNameNode.java:364)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$3.run(SecondaryNameNode.java:353)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.downloadCheckpointFiles(SecondaryNameNode.java:353)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:438)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:329)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:288)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:337)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1110)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:285)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Server returned HTTP response code: 403 for URL: https://name.node.http:50470/getimage?getimage=1
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
    at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2308)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getHeaderField(HttpsURLConnectionImpl.java:271)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:175)
    ... 14 more

Seems namenode rejects request of secondarynode with http error code 403.

Kerberos is configured with hadoop, and auth is passed by namenode to accept the request of secondary namenode:

2011-10-25 11:27:40,033 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successfull for hadoop/[email protected]
2011-10-25 11:27:40,100 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successfull for hadoop/[email protected] for protocol=interface org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol
2011-10-25 11:27:40,101 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 123.58.169.92

Does anyone know how could that happen? How can I fix it?

Thanks very much.


Solution

  • I think it's more appropriate to move my comment above to here as an answer.

    This error is because of the _HOST macro setting of secondary namemode principal in hdfs-site.xml, if there is no dfs.secondary.http.address set in hdfs-site.xml, the _HOST will be translated by the one who use it.

    I this case, code runs in namenode, so, _HOST parsed to namenode address, since kerberos principal composed of name, hostname, realm, that's a different principal, that's why authentication failed.