hadoop hdfs kerberos distcp

Transfer of files from unsecured hdfs to secured hdfs cluster


I want to transfer files from an unsecured HDFS cluster to a Kerberized cluster. I am using distcp for the transfer and ran the following command.

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://<ip>:8020/<sourcedir> hdfs://<ip>:8020/<destinationdir>

I get the following error when I run the above command on the Kerberized cluster.

java.io.EOFException: End of File Exception between local host is: "<xxx>"; destination host is: "<yyy>; : java.io.EOFException; For more details see:  http://wiki.apache.org/hadoop/EOFException

Solution

  • This error occurs because:

    RPC communication with the cluster is blocked. In such cases the webhdfs protocol, which goes over HTTP instead of RPC, can be used, so the distcp command above can be rewritten as

    hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://xxx:8020/src_path webhdfs://yyy:50070/target_path
    

    This is a very good blog post on distcp.
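
    The rewritten command can be wrapped in a small helper so the hosts and ports are easy to swap. This is only a sketch: the function name build_distcp_cmd and the RPC_PORT / HTTP_PORT variables are made up for illustration, and the host and path arguments are placeholders. The defaults 8020 and 50070 are the ports used in the command above; 50070 is the default NameNode HTTP port in Hadoop 2.x, while Hadoop 3.x defaults to 9870, so adjust it for your cluster. The helper only prints the command, it does not run it.

```shell
#!/usr/bin/env bash
# Sketch: print a distcp command that reads the source over RPC (hdfs://)
# and writes the target over HTTP (webhdfs://), as in the answer above.
# build_distcp_cmd, RPC_PORT and HTTP_PORT are hypothetical names.

RPC_PORT="${RPC_PORT:-8020}"     # NameNode RPC port used in the question
HTTP_PORT="${HTTP_PORT:-50070}"  # NameNode HTTP port (9870 by default on Hadoop 3.x)

build_distcp_cmd() {
  # Arguments: source host, source path, target host, target path.
  # Paths are expected to start with "/".
  local src_host="$1" src_path="$2" dst_host="$3" dst_path="$4"
  printf 'hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://%s:%s%s webhdfs://%s:%s%s\n' \
    "$src_host" "$RPC_PORT" "$src_path" \
    "$dst_host" "$HTTP_PORT" "$dst_path"
}

# Example with the placeholder hosts from the answer:
build_distcp_cmd xxx /src_path yyy /target_path
```

    Review the printed command, then run it from a session that has a valid Kerberos ticket for the secured cluster.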