hive hadoop-yarn azure-hdinsight ambari

Vertex failed: out-of-memory error in Hive on Azure HDInsight


I am getting an out-of-memory error while joining two datasets: one contains 39M rows and the other contains 360K rows.

I have 2 worker nodes, each with a maximum memory of 125 GB.

In YARN, Memory allocated for all YARN containers on a node = 96 GB

Minimum Container Size (Memory) = 3072

In the Hive settings:

hive.tez.java.opts=-Xmx2728M -Xms2728M -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB

hive.tez.container.size=3410
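
For reference, the two Hive values above are related: the `-Xmx` heap in `hive.tez.java.opts` is 80% of `hive.tez.container.size`, which matches a commonly cited rule of thumb for leaving headroom for non-heap JVM memory. A minimal sketch that checks this ratio from the posted values (the helper name `heap_fraction` is hypothetical, not part of Hive or Tez):

```python
import re


def heap_fraction(java_opts, container_mb):
    """Return (heap_mb, heap_mb / container_mb) for a -Xmx<N>M flag."""
    # Hypothetical helper: only handles heap sizes given in megabytes.
    match = re.search(r"-Xmx(\d+)[Mm]", java_opts)
    if match is None:
        raise ValueError("no -Xmx<N>M flag found")
    heap_mb = int(match.group(1))
    return heap_mb, heap_mb / container_mb


# Values copied from the settings posted above.
opts = (
    "-Xmx2728M -Xms2728M -Djava.net.preferIPv4Stack=true "
    "-XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB"
)
heap_mb, fraction = heap_fraction(opts, 3410)
print(heap_mb, round(fraction, 2))
```

If the container size is raised later, the `-Xmx` value would need to be raised with it to keep the same proportion.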

What values should I set to get rid of the out-of-memory error?


Solution

  • I solved it by increasing the YARN memory settings:

    Minimum Container Size (Memory): 3072 to 3840

    Memory allocated for all YARN containers on a node: 96 GB to 120 GB (each node had 120 GB)

    Percentage of physical CPU allocated for all containers on a node: 80%

    Number of virtual cores: 8

    https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-hive-out-of-memory-error-oom
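
    One possible way to see why these numbers matter is to work out how many Tez containers fit on a node. The sketch below rests on assumptions that are not stated in the post: that YARN rounds each container request up to a multiple of the minimum container size (as I understand the Capacity Scheduler to behave), that 1 GB means 1024 MB in these settings, and that `hive.tez.container.size` stayed at 3410. The function name `containers_per_node` is hypothetical.

```python
def containers_per_node(node_memory_mb, min_container_mb, request_mb):
    """Return (granted_mb, container_count) for one node.

    Assumption: a request is rounded up to the next multiple of the
    minimum container size before it is granted.
    """
    # Ceiling division without floats: -(-a // b) == ceil(a / b).
    granted_mb = -(-request_mb // min_container_mb) * min_container_mb
    return granted_mb, node_memory_mb // granted_mb


# Before: 96 GB per node, minimum container 3072 MB, request 3410 MB.
before = containers_per_node(96 * 1024, 3072, 3410)
# After: 120 GB per node, minimum container 3840 MB, same request.
after = containers_per_node(120 * 1024, 3840, 3410)

print(before)
print(after)
```

    Under those assumptions, a 3410 MB request no longer spills into a second minimum-size slot once the minimum is 3840 MB, so more containers fit on each node. Treat this as an illustration of the arithmetic, not a confirmed explanation of the fix.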