java hadoop mapreduce partitioner

Outputting single file for partitioner


I am trying to get as many reducers as the number of keys.

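import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class CustomPartitioner extends Partitioner<Text, Text>
{
    @Override
    public int getPartition(Text key, Text value, int numReduceTasks)
    {
        System.out.println("In CustomP");
        // hashCode() can be negative; mask the sign bit so the partition index is never negative.
        return (key.toString().hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}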

Driver class

job6.setMapOutputKeyClass(Text.class);
job6.setMapOutputValueClass(Text.class);
job6.setOutputKeyClass(NullWritable.class);
job6.setOutputValueClass(Text.class);
job6.setMapperClass(LastMapper.class);
job6.setReducerClass(LastReducer.class);
job6.setPartitionerClass(CustomPartitioner.class);
job6.setInputFormatClass(TextInputFormat.class);
job6.setOutputFormatClass(TextOutputFormat.class);

But I am getting the output in a single file.

Am I doing anything wrong?


Solution

  • You cannot control the number of reducers without specifying it: the default is a single reduce task, which is why all the output lands in one file. Set the count explicitly in the driver, for example job6.setNumReduceTasks(n). Even then, there is no guarantee that every key goes to a different reducer, because you do not know in advance how many distinct keys the input contains, and a hash-based partition function may return the same number for two distinct keys. To get one reducer per key, you have to know the distinct keys in advance and write the partition function accordingly.
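
To make the last point concrete, here is a minimal sketch in plain Java (no Hadoop dependency, so it runs on its own). The class name KnownKeyPartitioner, its constructor, and the sample keys are made up for illustration; in a real job the same lookup would presumably live inside your Partitioner's getPartition, and the driver would set the number of reduce tasks to the number of distinct keys. It first shows two distinct keys colliding under hash partitioning, then shows an index lookup keeping them apart.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: assigns each key known in advance its own partition index.
class KnownKeyPartitioner {
    private final Map<String, Integer> indexByKey = new HashMap<>();

    KnownKeyPartitioner(List<String> distinctKeys) {
        for (String key : distinctKeys) {
            // The first occurrence of a key gets the next free index; duplicates are ignored.
            indexByKey.putIfAbsent(key, indexByKey.size());
        }
    }

    // Number of reduce tasks needed for one reducer per known key.
    int numPartitions() {
        return indexByKey.size();
    }

    int getPartition(String key, int numReduceTasks) {
        Integer index = indexByKey.get(key);
        if (index == null) {
            // Assumption: unknown keys fall back to a non-negative hash partition.
            return hashPartition(key, numReduceTasks);
        }
        return index % numReduceTasks;
    }

    // Plain hash partitioning, with the sign bit masked so the result is never negative.
    static int hashPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("Aa", "BB", "C");
        KnownKeyPartitioner partitioner = new KnownKeyPartitioner(keys);
        int reducers = partitioner.numPartitions();

        for (String key : keys) {
            System.out.println(key
                    + " hash partition=" + hashPartition(key, reducers)
                    + " lookup partition=" + partitioner.getPartition(key, reducers));
        }
    }
}
```

"Aa" and "BB" have the same String.hashCode(), so hash partitioning always sends them to the same reducer no matter how many reducers there are, while the lookup gives each known key its own index. The trade-off is that the key list has to be available to every map task, for instance through the job configuration, which this sketch does not cover.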