How can I run multiple reduce tasks for a Pig job on YARN (Hadoop 2.2.0)?

I have been using a Hadoop 1.2.1 cluster and running many Pig jobs on it. Recently I considered moving to Hadoop 2.2.0, so I tried some of the same Pig jobs on Hadoop 2.2.0 that I had run on 1.2.1.

One thing I can hardly understand about YARN MR2 is that only ONE reduce task is scheduled in every MR job.

At first I thought this was fine: perhaps reduce was faster than in MR1 because the ResourceManager scheduled the reduce work efficiently by handling it on a single server.

But even for every large MR job, YARN MR2 schedules only ONE reduce task each time.

Below is an extreme case.

My old Hadoop (version 1.2.1) cluster consists of 1 JobTracker and 2 TaskTrackers (each 4-core, 32 GB).

Kind     Total Tasks (successful+failed+killed)  Successful tasks  Failed tasks  Killed tasks  Start Time            Finish Time
Setup    1                                       1                 0             0             27-Jan-2014 18:01:45  27-Jan-2014 18:01:46 (0sec)
Map      2425                                    2423              0             2             27-Jan-2014 18:01:26  27-Jan-2014 19:08:58 (1hrs, 7mins, 31sec)
Reduce   166                                     163               0             3             27-Jan-2014 18:04:35  27-Jan-2014 20:40:15 (2hrs, 35mins, 40sec)
Cleanup  1                                       1                 0             0             27-Jan-2014 20:40:16  27-Jan-2014 20:40:17 (1sec)

It took 2 hours and 38 minutes.

My new Hadoop (version 2.2.0) cluster consists of 1 ResourceManager and 8 NodeManagers (each 4-core, 32 GB). The new system has much better hardware.

Job Name:   PigLatin:DefaultJobName
User Name:  hduser
Queue:  default
State:  SUCCEEDED
Uberized:   false
Started:    Tue Jan 28 16:09:41 KST 2014
Finished:   Tue Jan 28 21:47:45 KST 2014
Elapsed:    5hrs, 38mins, 4sec
Diagnostics:    
Average Map Time    41sec
Average Reduce Time 3hrs, 48mins, 23sec
Average Shuffle Time    1hrs, 36mins, 35sec
Average Merge Time  1hrs, 27mins, 38sec
ApplicationMaster
Attempt Number  Start Time                    Node              Logs
1               Tue Jan 28 16:09:39 KST 2014  awdatanode2:8042  logs

Task Type  Total  Complete
Map        1172   1172
Reduce     1      1

Attempt Type  Failed  Killed  Successful
Maps          0       1       1172
Reduces       0       0       1

It took 5 hours and 38 minutes.

Although my old Hadoop cluster has far fewer resources, it is much faster than the new one, because its reduce work is distributed across many tasks. The Hadoop 2.2.0 cluster, on the other hand, has plenty of resources, and its map phase was much faster than on the old system, but the single reduce task takes a terribly long time.

On Hadoop 2.2.0, memory is configured as 4 GB per map task (3 GB heap) and 8 GB per reduce task (6 GB heap). I tried various configuration settings, but the result was always a single reduce task.

So I examined the Pig source code.

The reason my Pig job always gets one reduce task is that the InputSizeReducerEstimator class cannot access the HDFS file system.

    // line 79 of InputSizeReducerEstimator.java
    List<POLoad> poLoads = PlanHelper.getPhysicalOperators(mapReduceOper.mapPlan, POLoad.class);

The resulting poLoads list always has size 0.

So the number of reduce tasks for my job is always estimated as one.
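To illustrate why an empty load list ends in a single reducer, here is a simplified sketch of an input-size-based estimate. It is not Pig's actual source: the class name, method name, and the input sizes in main are made up for illustration. The 1 GB per reducer and the cap of 999 are, as far as I know, Pig's defaults for pig.exec.reducers.bytes.per.reducer and pig.exec.reducers.max.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical sketch of an input-size-based reducer estimate.
// NOT Pig's InputSizeReducerEstimator; it only mirrors the general idea.
public class ReducerEstimateSketch {

    // Assumed defaults for pig.exec.reducers.bytes.per.reducer and pig.exec.reducers.max.
    public static final long BYTES_PER_REDUCER = 1_000_000_000L;
    public static final int MAX_REDUCERS = 999;

    /**
     * Estimates the reducer count from the sizes (in bytes) of the inputs
     * that the estimator was able to find.
     */
    public static int estimateReducers(List<Long> inputSizes, long bytesPerReducer, int maxReducers) {
        long totalBytes = 0;
        for (long size : inputSizes) {
            totalBytes += size;
        }
        // One reducer per bytesPerReducer of input, rounded up.
        int reducers = (int) Math.ceil((double) totalBytes / bytesPerReducer);
        // Never fewer than one reducer, never more than the configured cap.
        reducers = Math.max(1, reducers);
        return Math.min(maxReducers, reducers);
    }

    public static void main(String[] args) {
        // Hypothetical case: the estimator sees two inputs totalling 2.5 GB.
        List<Long> visibleInputs = Arrays.asList(1_500_000_000L, 1_000_000_000L);
        System.out.println(estimateReducers(visibleInputs, BYTES_PER_REDUCER, MAX_REDUCERS));

        // The case described above: no load operators found, so no input sizes at all.
        List<Long> noInputs = Collections.emptyList();
        System.out.println(estimateReducers(noInputs, BYTES_PER_REDUCER, MAX_REDUCERS));
    }
}
```

With an empty list the total is 0 bytes, so the estimate falls back to the minimum of one reducer no matter how large the real input is, which matches what I saw on Hadoop 2.2.0.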

Solution

I solved this problem by rebuilding pig-0.12.1-h2.jar.

I asked the Pig user group, and they patched it in:

https://issues.apache.org/jira/browse/PIG-3512