Tags: hive, hadoop-yarn, kryo, mapr

Hive - GenericUDTF - runQuery fails due to kryo stackoverflow exception


Environment: HiveServer2, Hive version 1.2

I'm trying to run a query that uses a custom UDTF class (one that extends GenericUDTF).

The UDF class contains a tree object which it uses for its calculations.

When the tree is small, the query runs properly, but when the tree grows, the query fails with the following error:

org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. null
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:155)
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:70)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:205)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:217)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.StackOverflowError
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.writeName(DefaultClassResolver.java:90)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:81)

Any idea how to solve this? Is there a Hive configuration property that would help?


Solution

  • It seems like the issue is related to https://github.com/EsotericSoftware/kryo/issues/103

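    My workaround was to build the tree object at run time (on the first call to process()) instead of at initialization time (in initialize()). That way the tree presumably does not exist yet when the plan containing the UDTF is serialized with Kryo.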
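
The workaround can be sketched roughly as follows. This is a minimal illustration in plain Java, not the original UDTF: the class LazyTreeUdtfSketch, the Node type, buildTree() and the lookup logic are all hypothetical stand-ins, and a real implementation would extend org.apache.hadoop.hive.ql.udf.generic.GenericUDTF and use its actual initialize()/process() signatures. The point is only the shape: initialize() leaves the tree unset, and process() builds it on the first row. Marking the field transient is an extra precaution, on the assumption that the serializer in use skips transient fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Hive UDTF. A real one would extend GenericUDTF;
// it is kept dependency-free here so the pattern can be run on its own.
class LazyTreeUdtfSketch {

    /** Hypothetical tree node; the question does not show the real tree type. */
    static final class Node {
        final int value;
        final List<Node> children = new ArrayList<>();
        Node(int value) { this.value = value; }
    }

    // Null until the first process() call, so an instance captured right
    // after initialize() holds no deep object graph. 'transient' is an
    // additional hint for serializers that honor the modifier.
    private transient Node tree;

    /** Mirrors initialize(): set up lightweight state only, no tree. */
    void initialize() {
        // Intentionally does not build the tree.
    }

    /** Mirrors process(): build the tree lazily, then use it for the row. */
    int process(int row) {
        if (tree == null) {
            tree = buildTree(10_000);
        }
        return depthOf(row);
    }

    boolean isTreeBuilt() {
        return tree != null;
    }

    // Builds a deep chain (node i has the single child i + 1) iteratively,
    // standing in for whatever large tree the real UDTF needs.
    private static Node buildTree(int size) {
        Node root = new Node(0);
        Node current = root;
        for (int i = 1; i < size; i++) {
            Node child = new Node(i);
            current.children.add(child);
            current = child;
        }
        return root;
    }

    // Walks the chain iteratively and returns the depth of the node whose
    // value equals 'target', or -1 if there is none.
    private int depthOf(int target) {
        int depth = 0;
        Node current = tree;
        while (current != null) {
            if (current.value == target) {
                return depth;
            }
            current = current.children.isEmpty() ? null : current.children.get(0);
            depth++;
        }
        return -1;
    }

    public static void main(String[] args) {
        LazyTreeUdtfSketch udtf = new LazyTreeUdtfSketch();
        udtf.initialize();
        System.out.println("tree built after initialize(): " + udtf.isTreeBuilt());
        udtf.process(42);
        System.out.println("tree built after process(): " + udtf.isTreeBuilt());
    }
}
```

The trade-off is that the first row processed in each task pays the cost of building the tree, and every task builds its own copy.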