I would like to run a TeraSort-based benchmark test on a Hadoop cluster. The script works: at first the job is in the RUNNING state, but after a few minutes it gets stuck in the ACCEPTED state with FinalStatus UNDEFINED. I thought it might be a resource problem, so I modified yarn-site.xml as shown below.
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
<description>Amount of physical memory, in MB, that can be allocated for containers.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
</property>
The same problem occurs again. You can also see some graphs of this process above. The job stalls when its progress bar is at roughly 9-15%.
Please verify the values set for these parameters too:
yarn.scheduler.maximum-allocation-mb, mapreduce.map.memory.mb, mapreduce.map.java.opts, mapreduce.reduce.memory.mb, mapreduce.reduce.java.opts
Start by setting yarn.scheduler.minimum-allocation-mb to 512 MB. This lets the scheduler allocate memory to tasks in smaller increments.
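As a rough sketch, the parameters listed above could be set along these lines on a node that offers 8192 MB to containers (the figure from the question). Every value except the 512 MB minimum is an illustrative assumption, not a tuned recommendation. The heap sizes in the java.opts entries are kept at about 80% of the matching container size, which is a common rule of thumb and not something measured on this cluster.

```xml
<!-- yarn-site.xml: scheduler allocation bounds (values are assumptions) -->
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>512</value>
</property>
<property>
  <!-- Should not exceed yarn.nodemanager.resource.memory-mb (8192 in the question) -->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>

<!-- mapred-site.xml: per-task container sizes and JVM heaps (values are assumptions) -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <!-- About 80% of mapreduce.map.memory.mb, leaving room for non-heap memory -->
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx819m</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <!-- About 80% of mapreduce.reduce.memory.mb -->
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx1638m</value>
</property>
```

With container sizes like these, several map and reduce tasks fit on one node alongside the application master, instead of each request being rounded up to a 2048 MB minimum.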
Update 1: This link should help clarify a few things: https://www.mapr.com/blog/best-practices-yarn-resource-management
Also, set the input split size to suit your environment. For optimum read performance, the input split size should be the same as the HDFS block size.
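One possible way to line the two up is sketched below. It assumes a 128 MB block size (134217728 bytes), which is the default in recent Hadoop releases but may differ on your cluster, and it assumes the job reads its input through a FileInputFormat-based input format so that the split size properties are honored.

```xml
<!-- hdfs-site.xml: block size for newly written files (128 MB is an assumption) -->
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value>
</property>

<!-- mapred-site.xml: pin the split size to the same 128 MB -->
<property>
  <name>mapreduce.input.fileinputformat.split.minsize</name>
  <value>134217728</value>
</property>
<property>
  <name>mapreduce.input.fileinputformat.split.maxsize</name>
  <value>134217728</value>
</property>
```

Note that dfs.blocksize only affects files written after the change, so existing benchmark input would need to be regenerated for the block size to match.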