authentication hadoop hdfs kerberos kerberos-delegation

Connect to HDFS using ticket cache instead of keytab file


I have two clusters (cluster 1 and cluster 2), both secured with Kerberos authentication. I can only read data from the clusters and cannot change configuration files on either of them.

I can access cluster 1 with a keytab file: a user ID has already been created there and a keytab file generated for it.

So I know how to access HDFS on cluster 1 from Java using the keytab file:

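import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://cloudera:8020");
conf.set("hadoop.security.authentication", "kerberos");

// Log in from the keytab before making any FileSystem call.
UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab("hdfs@CLOUDERA", "/etc/hadoop/conf/hdfs.keytab");

// List the entries directly under the HDFS root.
FileSystem fs = FileSystem.get(conf);
for (FileStatus status : fs.listStatus(new Path("/"))) {
    System.out.println(status.getPath());
}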

Problem: I do not have a keytab file for cluster 2, but I can run kinit and access HDFS from the command line (hdfs dfs -ls / works). However, I want to access HDFS programmatically.

Question: Is it possible to access HDFS from Java without a keytab file? If I can run kinit on the command line and then access the cluster, there should be some way to do the same from Java using the cache that kinit generates instead of a keytab.

I tried to generate a keytab file with the ktutil command, but I ran into pre-authentication issues, and I do not have the rights to change any configuration files.


Solution

  • You can use the ticket cache to log in. You just need to give UserGroupInformation the path to the cache, and the tickets in it will be used.

    More specifically:

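    1. Run: kinit -f -p -c /Path/where/your/cache/will/be/created username@YOUR_REALM (-f requests a forwardable ticket, -p a proxiable one, and -c names the cache file)
    2. Change the permissions on the cache file so that the user running your Java program can read it.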
    3. export KRB5CCNAME="PATH to your cache"
    4. export HVR_HDFS_KRB_TICKETCACHE="PATH to your cache"
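
Before moving on to the Java side, it can help to confirm that the exported cache is actually visible to the JVM. The following is a minimal sketch in plain Java, not part of Hadoop: `TicketCachePath` and its two methods are hypothetical names introduced for illustration. It assumes a file-based cache, so it strips an optional `FILE:` type prefix from `KRB5CCNAME`; other cache types (for example `KEYRING:`) are not handled.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical preflight helper; not a Hadoop or JDK class. */
final class TicketCachePath {

    /** Strips an optional "FILE:" type prefix from a KRB5CCNAME value. */
    static String stripType(String ccName) {
        return ccName.startsWith("FILE:") ? ccName.substring("FILE:".length()) : ccName;
    }

    /** Returns the cache path if it names a readable regular file; otherwise throws. */
    static Path requireReadable(String ccName) {
        if (ccName == null || ccName.isEmpty()) {
            throw new IllegalStateException("KRB5CCNAME is not set; run kinit and export it first");
        }
        Path cache = Paths.get(stripType(ccName));
        if (!Files.isRegularFile(cache) || !Files.isReadable(cache)) {
            throw new IllegalStateException("Ticket cache is missing or unreadable: " + cache);
        }
        return cache;
    }

    public static void main(String[] args) {
        // Reads the variable exported in step 3 and fails fast if the cache is unusable.
        Path cache = requireReadable(System.getenv("KRB5CCNAME"));
        System.out.println("Using ticket cache " + cache);
    }
}
```

Running this in the same shell as the export commands shows whether the Java process sees the same cache that kinit wrote.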

    Once that is done, it is time to reference the cache in the code:

    Add all the Hadoop-related configuration to your Configuration object, plus the following two settings:

     1. conf.set("hadoop.security.kerberos.ticket.cache.path", "Path to your cache");
     2. conf.set("hadoop.security.token.service.use_ip", "true");
    

    This will let you log in. Make sure that you really export your cache as shown above. If an old cache gets in the way, use kdestroy to destroy it and then create a new one.
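
Putting the pieces together, a ticket-cache login might be sketched as follows. This is an illustration under stated assumptions rather than the answer's exact code: the answer does not name the login call, so using `UserGroupInformation.getUGIFromTicketCache` together with `doAs` is one possible choice. The class name `TicketCacheLogin` and the three command-line arguments are hypothetical, and the Hadoop client libraries are assumed to be on the classpath.

```java
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class TicketCacheLogin {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumed arguments: cache file written by kinit -c, the principal, the NameNode URI.
        String cachePath = args[0];
        String principal = args[1];
        String nameNodeUri = args[2];

        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", nameNodeUri);
        conf.set("hadoop.security.authentication", "kerberos");
        // The two settings named in the answer.
        conf.set("hadoop.security.kerberos.ticket.cache.path", cachePath);
        conf.set("hadoop.security.token.service.use_ip", "true");
        UserGroupInformation.setConfiguration(conf);

        // Assumption: build the user from the existing ticket cache instead of a keytab.
        UserGroupInformation ugi = UserGroupInformation.getUGIFromTicketCache(cachePath, principal);

        // Run the HDFS calls as that user and list the root directory.
        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            try (FileSystem fs = FileSystem.get(conf)) {
                for (FileStatus status : fs.listStatus(new Path("/"))) {
                    System.out.println(status.getPath());
                }
            }
            return null;
        });
    }
}
```

Run it from the shell in which the cache was created and exported, passing the same cache path that was given to kinit. The tickets in the cache expire, so a long-running program would need kinit to be run again from time to time.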