ubuntu hadoop hive hadoop-yarn

Hive getting stuck for queries, what can be the issue?


I have installed Hive, and it runs basic queries properly, but it gets stuck on queries that use DISTINCT. When I follow the tracking link to check the problem, it displays

ACCEPTED: waiting for AM container to be allocated, launched and register with RM.

as the YarnApplicationState.

I have tried changing yarn-site.xml, including increasing the YARN scheduler capacity percentage (yarn.scheduler.capacity.maximum-am-resource-percent), but nothing works and the query stays stuck at the same step. I have attached my yarn-site.xml, a screenshot of the application, and a screenshot of Hive. There are no unhealthy nodes; a screenshot of the nodes page is attached as well.

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>mapreduce.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>127.0.0.1:8032</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>1.0</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
    <description>Physical memory, in MB, to be made available to running containers</description>
  </property>
</configuration>

The screenshots are in the following order:

Hive stuck error

Application Manager

Nodes details

Yarn Node Manager log


Solution

  • Your cluster metrics show zero vcores, zero memory, and zero active nodes, so no processing can happen until you add a Node Manager to your cluster.

    The Resource Manager is the scheduler but the Node Managers provide the compute resources.

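    You only see this with DISTINCT because Hive can answer some simple queries without triggering a MapReduce job, so those queries never need a YARN container. A DISTINCT query does launch a job, and its Application Master cannot be allocated while the cluster has no resources.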
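
One way to confirm this diagnosis without screenshots is to read the same numbers from the ResourceManager's REST API. The following is a minimal Python sketch, not a definitive tool. It assumes the ResourceManager web endpoint is reachable on 127.0.0.1 at the default port 8088 (the question only shows port 8032, which is the RPC address), and the helper names `fetch_cluster_metrics` and `diagnose` are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: the ResourceManager web/REST endpoint uses the default port 8088.
RM_METRICS_URL = "http://127.0.0.1:8088/ws/v1/cluster/metrics"


def fetch_cluster_metrics(url=RM_METRICS_URL, timeout=5):
    """Return the 'clusterMetrics' object from the ResourceManager REST API."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)["clusterMetrics"]


def diagnose(metrics):
    """List reasons why an application could stay in the ACCEPTED state."""
    problems = []
    if metrics.get("activeNodes", 0) == 0:
        problems.append("no active NodeManagers")
    if metrics.get("totalMB", 0) == 0:
        problems.append("cluster reports 0 MB of memory")
    if metrics.get("totalVirtualCores", 0) == 0:
        problems.append("cluster reports 0 vcores")
    return problems
```

Calling `diagnose(fetch_cluster_metrics())` on a cluster in the state described above should report all three problems. An empty list means the Node Manager has registered and the cause lies elsewhere. The command `yarn node -list -all` gives a similar view from the shell.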
