
Difference between a reduce task and a reducer


I read that "a reducer is different than a reduce task. A reducer can run multiple reduce tasks". Can someone explain this using the example below?

foo.txt: Sweet, this is the foo file
bar.txt: This is the bar file

I am using 2 reducers. What are the reduce tasks, and on what basis are multiple reduce tasks generated in a reducer?


Solution

  • Reducer is a class that contains a reduce function, declared as below:

    protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context
                            ) throws IOException, InterruptedException {
        // invoked once per key, with all the values grouped under that key
    }

    A reduce task is a program running on a node that executes the reduce function of the Reducer class.

    You can think of a reduce task as a running instance of the Reducer.

    Have a look at the Payload section of the Apache MapReduce tutorial for more details.
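
    To make this concrete with the two files from the question, here is a rough sketch in plain Java with no Hadoop dependency. It assumes a word-count style job, which the question does not actually state, and a job configured with 2 reducers (for example via job.setNumReduceTasks(2)), so that 2 reduce tasks run. The class ReduceTaskSketch and its helpers are hypothetical names made up for illustration. The partition formula mirrors Hadoop's default HashPartitioner, but it hashes a Java String here rather than a Hadoop Text key, so the partition numbers in a real job may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone simulation; not Hadoop code.
class ReduceTaskSketch {

    // Same formula as Hadoop's default HashPartitioner, applied to String keys
    // (assumption: a real job hashes the Text key, so results may differ).
    public static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Stand-in for Reducer.reduce(): called once per key with all of its values.
    public static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Simulates map + shuffle: for each reduce task, the sorted key -> values groups it receives.
    public static List<Map<String, List<Integer>>> shuffle(String[] lines, int numReduceTasks) {
        List<Map<String, List<Integer>>> tasks = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            tasks.add(new TreeMap<>());
        }
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Every occurrence of a given word lands in the same reduce task.
                tasks.get(partitionFor(word, numReduceTasks))
                     .computeIfAbsent(word, k -> new ArrayList<>())
                     .add(1);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        String[] lines = {"Sweet, this is the foo file", "This is the bar file"};
        List<Map<String, List<Integer>>> tasks = shuffle(lines, 2);
        for (int t = 0; t < tasks.size(); t++) {
            for (Map.Entry<String, List<Integer>> e : tasks.get(t).entrySet()) {
                System.out.println("reduce task " + t + ": " + e.getKey()
                        + " -> " + reduce(e.getKey(), e.getValue()));
            }
        }
    }
}
```

    In this sketch each reduce task owns one partition of the keys, and within a task the reduce function is called once per key. So the number of reduce tasks comes from the job configuration, while which keys each task sees comes from the partitioner.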