Tags: hadoop, mapreduce, hive, kerberos, oozie

Connecting to Hive in a Mapper with Kerberos security


My goal is to run a MapReduce job that connects to Hive from an Oozie workflow on a Kerberos-secured HDP 2.3 cluster.

I'm able to connect to Hive from Beeline, or when I run the code as a plain Java application (yarn jar), with the following connection string:

DriverManager.getConnection("jdbc:hive2://host:10000/;principal=hive/_HOST@REALM", "", "");

But when I run the same code inside a Mapper, it fails:

 ERROR [main] org.apache.thrift.transport.TSaslTransport: SASL negotiation failure
    javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
        at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
        at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
        at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:190)
        ...
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
        at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
        at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
        at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
        at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)

How can I make it work in a Mapper?


Solution

  • It works with Hive delegation tokens. The Mapper runs in a YARN container that has no Kerberos TGT (hence the "Failed to find any Kerberos tgt" error), so instead obtain a Hive delegation token up front, ship it with the job credentials, and let the Mapper authenticate with that token:

    1. Oozie

      • Add the hive2 credential properties:

        hive2.server.principal=hive/_HOST@REALM
        hive2.jdbc.url=jdbc:hive2://{host}:10000/default
        
      • Set the action's credentials to the hive2 credential type (see the workflow sketch after the Mapper example below)

      • Mapper Example:

        public class HiveMapperExample extends Mapper<LongWritable, Text, Text, Text> {
        
            @Override
            protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
                try {
                    Class.forName("org.apache.hive.jdbc.HiveDriver");
                    // auth=delegationToken makes the driver authenticate with the Hive delegation
                    // token shipped in the task's credentials instead of a Kerberos TGT
                    Connection connect = DriverManager.getConnection("jdbc:hive2://{host}:10000/;auth=delegationToken", "", "");
                    Statement state = connect.createStatement();
                    ResultSet resultSet = state.executeQuery("select * from some_table");
                    while (resultSet.next()) {
                        ...
                    }
                } catch (Exception e) {
                    ...
                }
            }
        }
        
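      • Workflow sketch: a minimal, illustrative workflow.xml showing how the properties above go into a hive2 credential and how the action references it (the credential name, action name, and host are placeholders chosen here, not part of the original answer):

        <workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-mapper-wf">
            <credentials>
                <credential name="hive2_cred" type="hive2">
                    <property>
                        <name>hive2.server.principal</name>
                        <value>hive/_HOST@REALM</value>
                    </property>
                    <property>
                        <name>hive2.jdbc.url</name>
                        <value>jdbc:hive2://{host}:10000/default</value>
                    </property>
                </credential>
            </credentials>
            <start to="hive-mapper"/>
            <action name="hive-mapper" cred="hive2_cred">
                <map-reduce>
                    ...
                </map-reduce>
                <ok to="end"/>
                <error to="fail"/>
            </action>
            <kill name="fail">
                <message>Hive mapper action failed</message>
            </kill>
            <end name="end"/>
        </workflow-app>
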
    2. ToolRunner

      public class HiveTestApplication extends Configured implements Tool {
      
          public static void main(String[] args) throws Exception {
              System.exit(ToolRunner.run(new HiveTestApplication(), args));
          }
      
          @Override
          public int run(String[] args) throws Exception {
              Configuration conf = getConf(); // configuration populated by ToolRunner
              //set your conf
              Job job = Job.getInstance(conf);
              job.setMapperClass(HiveMapperExample.class);
      
              addHiveDelegationToken(job.getCredentials(), "jdbc:hive2://{host}:10000/", "hive/_HOST@REALM");
      
              return job.waitForCompletion(true) ? 0 : 1;
          }
      
      
          public void addHiveDelegationToken(Credentials creds, String url, String principal) throws Exception {
              Class.forName("org.apache.hive.jdbc.HiveDriver");
      
              Connection con = DriverManager.getConnection(url + ";principal=" + principal);
              // fetch a delegation token for the current user, renewable by the Hive service principal
              String tokenStr = ((HiveConnection) con).getDelegationToken(UserGroupInformation.getCurrentUser().getShortUserName(), principal);
              con.close();
      
              Token<DelegationTokenIdentifier> hive2Token = new Token<>();
              hive2Token.decodeFromUrlString(tokenStr);
              // register the token under both aliases so the Hive JDBC driver can find it
              // in the task's credentials when auth=delegationToken is used
              creds.addToken(new Text("hive.server2.delegation.token"), hive2Token);
              creds.addToken(new Text(HiveAuthFactory.HS2_CLIENT_TOKEN), hive2Token);
          }
      }