Hadoop YARN reducer/shuffle stuck

I was migrating from Hadoop 1 to Hadoop 2 YARN. Source code were recompiled using MRV2 jars and didn't have any compatibility issue. When I was trying to run the job under YARN, map worked fine and went to 100%, but reduce was stuck at ~6,7%. There's no performance issue. Actually, I checked CPU usage, it turned out when reduce was stuck, there seems like no computation going on because CPU is mostly 100% idle. The job can run successfully on Hadoop 1.2.1.

I checked the log messages from resourcemanager and found out that since map finished, no more container was allocated so there's no reduce is running on any container. What caused this situation?

I'm wondering if it is related to the yarn.nodemanager.aux-services property setting. By following the official tutorial(http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html), this property has to be set to mapreduce_shuffle which indicates that MR will still use default shuffle method instead of other shuffle plugins(http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html). I tried not to set this property but Hadoop wouldn't let me.

Here's the log of userlogs/applicationforlder/containerfolder/syslog when it's about to reach 7% of reduce. After that log didn't update anymore and reduce stopped as well.

2014-11-26 09:01:04,104 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1416988910568_0001_m_002988_0 decomp: 129587 len: 129591 to MEMORY
2014-11-26 09:01:04,104 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 129587 bytes from map-output for attempt_1416988910568_0001_m_002988_0
2014-11-26 09:01:04,104 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 129587, inMemoryMapOutputs.size() -> 2993, commitMemory -> 342319024, usedMemory ->342448611
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1416988910568_0001_m_002989_0 decomp: 128525 len: 128529 to MEMORY
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput: Read 128525 bytes from map-output for attempt_1416988910568_0001_m_002989_0
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 128525, inMemoryMapOutputs.size() -> 2994, commitMemory -> 342448611, usedMemory ->342577136
2014-11-26 09:01:04,105 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: datanode03:13562 freed by fetcher#1 in 13ms

Was this a common issue when migrating from Hadoop 1 to 2? Was the strategy of running map-shuffle-sort-reduce changed in Hadoop 2? What caused this problem? Thanks so much. Any comments will help!

Major environment setup:

Hadoop version: 2.5.2
6-node cluster with 8-core CPU, 15 GB memory on each node

Related properties settings:

yarn.scheduler.maximum-allocation-mb: 14336
yarn.scheduler.minimum-allocation-mb: 2500
yarn.nodemanager.resource.memory-mb: 14336
yarn.nodemanager.aux-services: mapreduce_shuffle
mapreduce.task.io.sort.factor: 100
mapreduce.task.io.sort.mb: 1024

Solution

Finally solved the problem after googling around and found out I posted this question three month ago already.

It's because of the data skew.