Tags: java, hadoop, hadoop-partitioning

What happens when two different keys go to the same reducer under the default hash partitioner in Hadoop?


As we know, Hadoop guarantees that identical keys coming from different mappers will be sent to the same reducer.

But if two different keys have the same hash value, they will definitely go to the same reducer. Will they then be passed to the same reduce method, and in order? Which part is responsible for this logic?

Thanks a lot!


Solution

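  • Q1: Will they be passed to the same reduce method, and in order?

    Ans: Yes, in the sense that both keys go to the same reduce task and arrive in sorted key order. Each distinct key still gets its own reduce() call (assuming the default grouping comparator).


    Q2: Which part is responsible for this logic?

    Ans: The partitioner chooses the reducer, and the shuffle and sort phase merges and orders the keys before reduce() is called.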


    Example:

    key   value
    1     2
    1     2
    2     5
    3     19
    6     20
    

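    Let's say the number of reducers is 5, and assume integer keys whose hash code is the key value itself (as with IntWritable). The default HashPartitioner sends each key to partition (key.hashCode() & Integer.MAX_VALUE) % 5, so:

    Reduce 0 will get no key-value pairs
    Reduce 1 will get keys 1 and 6, in that (sorted) order
    Reduce 2 will get key 2
    Reduce 3 will get key 3
    Reduce 4 will get no key-value pairs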
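
The assignment above can be reproduced with a minimal sketch that does not depend on Hadoop at all. It assumes the keys behave like `IntWritable`, whose hash code is the integer value, and it copies the arithmetic used by the default `HashPartitioner`. The class name `PartitionDemo` and the helpers `partitionFor` and `assign` are made up for this illustration; they are not part of the Hadoop API. A `TreeSet` stands in for the sorted order that the shuffle and sort phase provides.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class PartitionDemo {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
    // Integer.hashCode() returns the int value, like IntWritable (assumption).
    public static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical helper: groups the distinct keys by reducer index.
    // TreeSet keeps each reducer's keys sorted, imitating the sort phase.
    public static Map<Integer, TreeSet<Integer>> assign(int[] keys, int numReduceTasks) {
        Map<Integer, TreeSet<Integer>> byReducer = new TreeMap<>();
        for (int r = 0; r < numReduceTasks; r++) {
            byReducer.put(r, new TreeSet<>());
        }
        for (int key : keys) {
            byReducer.get(partitionFor(key, numReduceTasks)).add(key);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        int[] keys = {1, 1, 2, 3, 6};  // keys from the example table
        assign(keys, 5).forEach((reducer, ks) ->
                System.out.println("Reduce " + reducer + " -> keys " + ks));
    }
}
```

With these inputs, keys 1 and 6 both map to partition 1 because 1 % 5 and 6 % 5 are both 1. With `Text` keys the hash codes would be different, so the actual assignment depends on the key type.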