apache-spark hive apache-spark-sql beeline

AuthorizationException: User not allowed to impersonate User


I wrote a Spark job that registers a temp table, and I expose it via beeline (a JDBC client):

$ ./bin/beeline
beeline> !connect jdbc:hive2://IP:10003 -n ram -p xxxx
0: jdbc:hive2://IP> show tables;
+------------+--------------+
| tableName  | isTemporary  |
+------------+--------------+
| f238       | true         |
+------------+--------------+
2 rows selected (0.309 seconds)
0: jdbc:hive2://IP>

I can see the table, but when I query it I get this error message:

0: jdbc:hive2://IP> select * from f238;
Error: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: ram is not allowed to impersonate ram (state=,code=0)
0: jdbc:hive2://IP>

I have this in hive-site.xml:

<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>false</value>
  <description>If true, the metastore Thrift interface will be secured with SASL. Clients must authenticate with Kerberos.</description>
</property>

<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>

<property>
  <name>hive.server2.authentication</name>
  <value>NONE</value>
</property>

I have this in core-site.xml:

<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>

<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>

Full log:

ERROR [pool-19-thread-2] thriftserver.SparkExecuteStatementOperation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: ram is not allowed to impersonate ram
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.runInternal(SparkExecuteStatementOperation.scala:259)
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:182)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Any idea what configuration I am missing?


Solution

Set hive.server2.enable.doAs to true in hive-site.xml:

<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>

Also, if you want user ABC (the user performing the impersonation) to be able to impersonate all users (*), add the properties below to your core-site.xml:

<property>
  <name>hadoop.proxyuser.ABC.groups</name>
  <value>*</value>
</property>

<property>
  <name>hadoop.proxyuser.ABC.hosts</name>
  <value>*</value>
</property>
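
As a concrete illustration for the setup in the question: in Hadoop's "User: X is not allowed to impersonate Y" message, X is the user attempting the impersonation, which here is ram (presumably the account the Thrift server runs as). The core-site.xml in the question only grants proxy rights to hive. Assuming ram really is the server's user, the entries would be keyed on ram instead. This is a sketch under that assumption, not a verified fix for this cluster, and the wildcard values are deliberately permissive.

```xml
<!-- core-site.xml fragment (assumption: the Thrift server runs as user "ram") -->

<!-- Groups whose members "ram" may impersonate; "*" means any group -->
<property>
  <name>hadoop.proxyuser.ram.groups</name>
  <value>*</value>
</property>

<!-- Hosts from which "ram" may submit impersonated requests; "*" means any host -->
<property>
  <name>hadoop.proxyuser.ram.hosts</name>
  <value>*</value>
</property>
```

Proxy-user settings are read by the Hadoop daemons (NameNode, ResourceManager), so they typically need a restart or a refresh such as `hdfs dfsadmin -refreshSuperUserGroupsConfiguration` before the change takes effect. On a shared cluster, listing specific hosts and groups instead of `*` is usually the safer choice.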