hadoop mapreduce hadoop2 hashcode reducers

Hadoop's default partitioner: HashPartitioner - How it calculates hash-code of a key?

I am trying to understand partitioning in MapReduce and I came to know that Hadoop has a default partitioner known as HashPartitioner, and partitioner helps in deciding to which reducer a given key would go to.

Conceptually, it works like this:

hashcode(key) % NumberOfReducers, where `key` is the key in <key,value> pair.

My question is:

How does HashPartitioner calculate the hash-code for the key? Does is simply call the hashCode() of the key or does this HashPartitioner use some other logic to calculate the hash-code of the key?

Can anyone help me understand this?

Solution

The default partitioner simply use hashcode() method of key and calculate the partition. This give you an opportunity to implement your hascode() to tweak the way the keys are going to be partitioned.

From the javadoc:

public int getPartition(K key,
               V value,
               int numReduceTasks)
Use Object.hashCode() to partition.

For the actual code, it simply returns (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks:

 public int More ...getPartition(K key, V value,
                          int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
 }

EDIT: Adding detail on custom partitioner

You can add a different logic for partitioner, which even may not use hashcode() at all.

As custom partitioner can be written by extending Partitioner

public class CustomPartitioner extends Partitioner<Text, Text>

One such example, which works on the properties of the custom key object:

public static class CustomPartitioner extends Partitioner<Text, Text>{
@Override
public int getPartition(Text key, Text value, int numReduceTasks){
    String emp_dept = key.getDepartment();
    if(numReduceTasks == 0){
        return 0;
    }

    if(key.equals(new Text(“IT”))){
        return 0;
    }else if(key.equals(new Text(“Admin”))){
        return 1 % numReduceTasks;
    }else{
        return 2 % numReduceTasks;
    }
}