hive hiveql apache-tez

Hive query throws "code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask" exception when the query has a GROUP BY clause


I have Hive + LLAP on HDP 3.1.4.

The Hive and Tez configuration is:

yarn.nodemanager.resource.memory-mb = 40960
yarn.scheduler.minimum-allocation-mb = 1024
yarn.scheduler.maximum-allocation-mb = 40960
hive.tez.container.size = 4096
num_llap_nodes=4
hive.llap.daemon.num.executors=8
hive.llap.daemon.yarn.container.mb = 35840
llap_headroom_space=2048
llap_heap_size=32768
hive.llap.io.memory.size=1024
tez.am.resource.memory.mb=4096
hive.tez.java.opts=-server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -Xmx3276m
tez.runtime.io.sort.mb=1638
tez.runtime.unordered.output.buffer.size-mb=409

The following query runs properly:

select count(*) from balance;

But when I use a GROUP BY clause, as in the following query:

select count(*),jobdate from balance group by jobdate;

I've tried many configurations, but this long exception is always thrown:

ERROR: Error while processing statement: **FAILED: Execution Error, 
    return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.** 
    Vertex failed, vertexName=Map 1, 
    vertexId=vertex_1617520101397_0014_1_00, diagnostics=[Task 
    failed, taskId=task_1617520101397_0014_1_00_000013, 
    diagnostics=[TaskAttempt 0 failed, **info=[Error: Error while 
    running task ( failure ) : java.lang.NoClassDefFoundError: Could 
    not initialize class 
    org.apache.tez.runtime.library.api.TezRuntimeConfiguration**    at 

    **BLABLA**
        at java.lang.Thread.run(Thread.java:748) ]], Task failed, 
    taskId=task_1617520101397_0014_1_00_000006, 
    diagnostics=[TaskAttempt 0 failed, info=[Error: Error while 
    running task ( failure ) : java.lang.NoClassDefFoundError: Could 
    not initialize class 
    org.apache.tez.runtime.library.api.TezRuntimeConfiguration  at 
        at java.lang.Thread.run(Thread.java:748) ]], Task failed, 
    taskId=task_1617520101397_0014_1_00_000005, 
    diagnostics=[TaskAttempt 0 failed, info=[Error: Error while 
    running task ( failure ) : java.lang.NoClassDefFoundError: Could 
    not initialize class 
    org.apache.tez.runtime.library.api.TezRuntimeConfiguration  at 
    org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:111) 
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
        at java.lang.Thread.run(Thread.java:748) ]], **Vertex did not 
    succeed due to OWN_TASK_FAILURE, failedTasks:9 killedTasks:31761, 
    Vertex vertex_1617520101397_0014_1_00 [Map 1] killed/failed due 
    to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, 
    vertexId=vertex_1617520101397_0014_1_01, diagnostics=[Vertex 
    received Kill while in RUNNING state., Vertex did not succeed due 
    to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:18, Vertex 
    vertex_1617520101397_0014_1_01 [Reducer 2] killed/failed due 
    to:OTHER_VERTEX_FAILURE]DAG did not succeed due to 
    VERTEX_FAILURE. failedVertices:1 killedVertices:1 Error Code: 2**

Solution

  • There are two places to set hive.tez.container.size on the Ambari Hive Config page. One appears in the SETTINGS tab; the other, which applies to LLAP, is under Advanced hive-interactive-site in the ADVANCED tab. I had been changing the hive.tez.container.size value in the SETTINGS tab instead of in the Advanced hive-interactive-site section. Finally, I set the following configs and the error was resolved:

    set hive.tez.container.size=10240;
    set hive.tez.java.opts=-Xmx9216m;
    set tez.runtime.io.sort.mb=3072;
    set tez.runtime.unordered.output.buffer.size-mb=1024;
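Because the same property can be set in more than one place in Ambari, it may help to confirm which values the session actually sees. The following is a minimal sketch, assuming you are connected (for example through Beeline) to the interactive LLAP HiveServer2 endpoint; that connection is an assumption, not something stated above. Issuing SET with only a property name reads the current value rather than changing it.

```sql
-- Read back the values in effect for this session.
-- SET with a property name and no "=value" prints the current setting.
SET hive.tez.container.size;
SET hive.tez.java.opts;
SET tez.runtime.io.sort.mb;
SET tez.runtime.unordered.output.buffer.size-mb;
```

If hive.tez.container.size still shows the old value here, the change was probably saved in the other Ambari section and never reached the LLAP session.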