hadoop mapreduce cascading

How do I force a reducer in Cascading?


I want to gain some of the benefits that are only possible with reducers, not mappers.


Solution

  • Found my answer in Google Groups:

    Use a GroupBy, which will invariably use a reducer to perform the grouping:

    previousPipe = new GroupBy(previousPipe); // groups on Fields.ALL
    

    Reducing the number of files (an alternative with less coupling)

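    // Append a temporary "rand" field to every tuple, then group on it.
    previousPipe = new Each(previousPipe, new RandomNumGen(new Fields("rand")), Fields.ALL);
    previousPipe = new GroupBy(previousPipe, new Fields("rand"));
    

    Here RandomNumGen is a class you write yourself by implementing Function; it emits a random number into a new field, Fields("rand"). Its constructor is assumed to take the field declaration it should emit. GroupBy does not accept a Function argument, so the function is applied in an Each step before the grouping. The field is temporary, meaning you drop it later.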
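
The effect of grouping on a throwaway random key can be sketched without Cascading at all. The snippet below is a plain-Java model, not Cascading code: the class `RandomKeyGrouping`, the method `groupByRandomKey`, the `buckets` parameter, and the fixed seed are all assumptions made for illustration. It tags each record with a random integer in a bounded range and groups on that value, which is roughly the decision the random-number function makes per tuple ahead of the GroupBy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

/** Plain-Java model of grouping on a temporary random key (no Cascading classes involved). */
class RandomKeyGrouping {

    /**
     * Tags each record with a random key in [0, buckets) and groups by that key,
     * mimicking a step that appends a "rand" field followed by a grouping on it.
     */
    static Map<Integer, List<String>> groupByRandomKey(List<String> records, int buckets, long seed) {
        if (buckets <= 0) {
            throw new IllegalArgumentException("buckets must be positive");
        }
        Random random = new Random(seed); // seeded only so that runs are repeatable
        Map<Integer, List<String>> groups = new TreeMap<>();
        for (String record : records) {
            int key = random.nextInt(buckets); // the temporary "rand" value
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(record);
        }
        // The key only drives the grouping; callers work with the records afterwards.
        return groups;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            records.add("record-" + i);
        }
        Map<Integer, List<String>> groups = groupByRandomKey(records, 4, 42L);
        groups.forEach((key, members) ->
                System.out.println("group " + key + ": " + members.size() + " records"));
    }
}
```

Because the key is drawn from [0, buckets), there can never be more than `buckets` groups, whatever the data looks like, and the key has no tie to any real field. In an actual Hadoop job the number of part files is typically still determined by the number of reduce tasks configured for the flow.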

    For more information check this Google Groups thread: