My data file is:
Utsav Chatterjee Dangerous Soccer Coldplay 4
Rodney Purtle Awesome Football Maroon5 3
Michael Gross Amazing Basketball Iron Maiden 6
Emmanuel Ezeigwe Cool Pool Metallica 5
John Doe Boring Golf Linkin Park 8
David Bekham Godlike Soccer Justin Beiber 89
Abhishek Kumar Geek Cricket Abhishek Kumar 7
Abhishek Singh Geek Cricket Abhishek Kumar 7
I want to pass the column number as an argument when invoking the Hadoop jar, and I need the entire data set sorted on that column in descending order. Ascending order was easy: I set the required column as the key in the mapper output. However, I have not been able to do the same in descending order.
My Mapper and Reducer code is:
public static class Map extends Mapper<LongWritable,Text,Text,Text>{
    // Must be an instance method (not static) so that it overrides Mapper.map.
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException,InterruptedException
    {
        // Column index passed in from the driver through the job configuration.
        Configuration conf = context.getConfiguration();
        String param = conf.get("columnRef");
        int colref = Integer.parseInt(param);
        String line = value.toString();
        String[] parts = line.split("\t");
        // Emit the chosen column as the key so the shuffle sorts on it.
        context.write(new Text(parts[colref]), value);
    }
}
public static class Reduce extends Reducer<Text,Text,Text,Text>{
    @Override
    public void reduce(Text key, Iterable<Text> value, Context context)
            throws IOException,InterruptedException
    {
        // Keys arrive already sorted; write each full row back out.
        for (Text text : value) {
            context.write(text,null );
        }
    }
}
My comparator class is:
public static class sortComparator extends WritableComparator {
    protected sortComparator() {
        super(LongWritable.class, true);
        // TODO Auto-generated constructor stub
    }
    @Override
    public int compare(WritableComparable o1, WritableComparable o2) {
        LongWritable k1 = (LongWritable) o1;
        LongWritable k2 = (LongWritable) o2;
        int cmp = k1.compareTo(k2);
        return -1 * cmp;
    }
}
I'm probably doing something wrong with the comparator. Can anyone help me out here? When I run this with column index 5 (the numeric last column) as the sort key, the result still comes out in ascending order.
Driver class:
public static void main(String[] args) throws Exception {
    Configuration conf= new Configuration();
    conf.set("columnRef", args[2]);
    Job job = new Job(conf, "Sort");
    job.setJarByClass(Sort.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    job.setSortComparatorClass(DescendingKeyComparator.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    Path outputPath = new Path(args[1]);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    outputPath.getFileSystem(conf).delete(outputPath);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
Any advice on how I might achieve this (descending order) would be very helpful. Thanks!
In your driver class, look at the following line:
job.setSortComparatorClass(DescendingKeyComparator.class);
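You have set the sort comparator to DescendingKeyComparator.class, which is not the comparator you posted. Set it to sortComparator.class instead. Note also that sortComparator is constructed for LongWritable and casts its arguments to LongWritable, while your mapper emits Text keys. The comparator has to be written for the map output key type (Text here), otherwise the keys will not be deserialized or compared correctly. Keep in mind that Text compares byte-wise, so for the numeric column you would need to parse the values as numbers inside compare() to get a true numeric descending order. With those changes it should work.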
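To make the ordering logic concrete, here is a minimal plain-Java sketch of a descending comparison that works on the column value as a string and falls back to text comparison when the values are not numeric. It does not use Hadoop at all; the class name DescendingColumnSort and the helpers compareDescending and sortByColumn are hypothetical names used only for illustration, and the sketch assumes each column holds either all numeric or all non-numeric values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DescendingColumnSort {

    // Compares two column values so that larger values sort first.
    // Assumption: a column is either entirely numeric or entirely text;
    // mixing both kinds in one column is not ordered consistently.
    static int compareDescending(String a, String b) {
        try {
            long x = Long.parseLong(a.trim());
            long y = Long.parseLong(b.trim());
            return Long.compare(y, x);   // arguments swapped: descending
        } catch (NumberFormatException e) {
            return b.compareTo(a);       // text fallback, also descending
        }
    }

    // Returns a copy of the tab-separated rows sorted on column `col`.
    static List<String> sortByColumn(List<String> rows, int col) {
        List<String> sorted = new ArrayList<>(rows);
        sorted.sort((r1, r2) ->
                compareDescending(r1.split("\t")[col], r2.split("\t")[col]));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "Utsav\tChatterjee\tDangerous\tSoccer\tColdplay\t4",
                "David\tBekham\tGodlike\tSoccer\tJustin Beiber\t89",
                "John\tDoe\tBoring\tGolf\tLinkin Park\t8");
        // Sort on the numeric last column (index 5), largest first.
        for (String row : sortByColumn(rows, 5)) {
            System.out.println(row);
        }
    }
}
```

In the job itself, the same comparison would presumably go inside compare(WritableComparable, WritableComparable) of a WritableComparator whose constructor calls super(Text.class, true), applied to the toString() value of each Text key. Treat that as a sketch to adapt rather than a drop-in class.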