"a reducer is different than a reduce task. A reducer can run multiple reduce tasks". Can someone explain this with the below example?
foo.txt: Sweet, this is the foo file
bar.txt: This is the bar file
I am using 2 reducers. What are the reduce tasks here, and what determines how many reduce tasks are generated in a reducer?
Reducer is a class that contains a reduce function, as shown below:
protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context)
        throws IOException, InterruptedException {
    // called once per key, with all the values grouped under that key
}
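A reduce task is a program running on a node that executes the reduce function of the Reducer class.
You can think of a reduce task as an instance of Reducer: each task creates its own Reducer object and calls reduce once for every key in the partition assigned to it. The number of reduce tasks is the number you configure for the job (Job.setNumReduceTasks), so with 2 reducers you get 2 reduce tasks.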
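To make this concrete for the foo.txt / bar.txt example, here is a rough plain-Java sketch of a word count with 2 reduce tasks. It does not use Hadoop at all: the class ReduceTaskSketch and its methods are hypothetical names made up for illustration, and the partition rule only imitates the hash-modulo formula of Hadoop's default partitioner using String.hashCode, so the actual key-to-task assignment on a real cluster may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class ReduceTaskSketch {

    // Stand-in for Reducer.reduce(): called once per key with all of that key's values.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Assumed partitioning rule: non-negative hash of the key modulo the number of reduce tasks.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Runs a word count over the input lines and returns one output map per reduce task.
    static List<Map<String, Integer>> runJob(List<String> lines, int numReduceTasks) {
        // Shuffle: group the map output (word, 1) by key, separately for each reduce task.
        List<Map<String, List<Integer>>> shuffled = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            shuffled.add(new TreeMap<>());
        }
        for (String line : lines) {
            for (String word : line.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                if (word.isEmpty()) {
                    continue;
                }
                shuffled.get(partition(word, numReduceTasks))
                        .computeIfAbsent(word, k -> new ArrayList<>())
                        .add(1);
            }
        }
        // Each reduce task calls reduce() once for every key in its own partition.
        List<Map<String, Integer>> output = new ArrayList<>();
        for (Map<String, List<Integer>> taskInput : shuffled) {
            Map<String, Integer> taskOutput = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : taskInput.entrySet()) {
                taskOutput.put(e.getKey(), reduce(e.getKey(), e.getValue()));
            }
            output.add(taskOutput);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Sweet, this is the foo file", "This is the bar file");
        List<Map<String, Integer>> output = runJob(lines, 2);
        for (int task = 0; task < output.size(); task++) {
            System.out.println("reduce task " + task + ": " + output.get(task));
        }
    }
}
```

Every distinct word ends up in exactly one of the two reduce tasks, and within a task reduce is invoked once per word. So the number of reduce tasks stays at 2, while the number of reduce calls depends on how many distinct keys each task receives.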
Have a look at the Apache MapReduce tutorial page (Payload section) for more details.