I am using Hadoop core 0.20.2 and am running into an issue with incompatible types when trying to set the input format for my job. I'm just trying to get a simple word-count program running.
Here is my main method:
public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(Wordcount.class);
    conf.setJobName("wordcount");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
}
On the line conf.setInputFormat(TextInputFormat.class);
I am getting the error: incompatible types: Class<TextInputFormat> cannot be converted to Class<? extends InputFormat>
When I take a look at the setInputFormat method I see:
public void setInputFormat(Class<? extends InputFormat> theClass) {
}
While I'm not 100% sure what Class<? extends InputFormat> theClass means, I gather that I must pass a class which extends InputFormat. Please let me know if I am on the wrong track.
So when I take a look at the TextInputFormat class I see:
public class TextInputFormat extends FileInputFormat<LongWritable, Text>
So I'm passing a class which extends FileInputFormat, and NOT InputFormat directly.
But I believe FileInputFormat extends InputFormat, because I see in its declaration:
public abstract class FileInputFormat<K extends Object, V extends Object> extends InputFormat<K, V>
Am I correct about why I am getting this error? Or am I completely wrong, and is it in fact valid to pass a class that is a subclass of the required class at any depth in the hierarchy?
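For what it's worth, a parameter of type Class<? extends InputFormat> does accept indirect subclasses, not just direct ones. Here is a minimal self-contained sketch (using made-up classes Base, Middle, and Leaf standing in for InputFormat, FileInputFormat, and TextInputFormat) showing that the wildcard matches transitively:

```java
// Stand-ins for the Hadoop hierarchy:
// Base ~ InputFormat, Middle ~ FileInputFormat, Leaf ~ TextInputFormat
class Base {}
class Middle extends Base {}
class Leaf extends Middle {}

public class WildcardDemo {
    // Analogous to setInputFormat(Class<? extends InputFormat> theClass)
    static void accept(Class<? extends Base> theClass) {
        System.out.println("accepted: " + theClass.getSimpleName());
    }

    public static void main(String[] args) {
        // Compiles fine: Leaf extends Middle, which extends Base,
        // so Leaf is a subtype of Base at two levels of remove.
        accept(Leaf.class);
        accept(Middle.class);
    }
}
```

So the depth of the inheritance chain is not the problem here; if the compiler rejects TextInputFormat.class, it is because that particular TextInputFormat is not a subtype of the particular InputFormat in the method signature at all.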
I am fairly new to Java and even newer to Hadoop. I want to note that I am also getting errors on the lines
FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
which read: "incompatible types: JobConf cannot be converted to Job". I am aware 0.20.2 is not the latest version of Hadoop, but I have to work with this version. In newer versions of Hadoop I've come across other ways to create a job configuration, and I am starting to think my issues come from referencing classes which were added after 0.20.2.
I have been piecing this together from online resources, but I never know which version they are written for, so I may have mismatched code now. Any assistance would be greatly appreciated.
Take a look at the packages the classes come from. You should be using a consistent set: either the old API, whose classes live under org.apache.hadoop.mapred (this is where JobConf and JobClient come from), or the new API under org.apache.hadoop.mapreduce (which uses Job instead of JobConf). Both packages define classes named TextInputFormat, FileInputFormat, etc., and they are not interchangeable. I suspect you're mixing the two: your "JobConf cannot be converted to Job" errors are exactly what you'd see if you imported FileInputFormat and TextInputFormat from the mapreduce package while using the mapred JobConf. Switch those imports to the mapred versions.
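As a sketch, assuming you stay on the old API that JobConf belongs to (which matches the rest of your main method), the imports would look like this:

```java
// Old (mapred) API -- consistent with JobConf / JobClient.runJob:
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
```

If any of your current imports say org.apache.hadoop.mapreduce.lib.input or org.apache.hadoop.mapreduce.lib.output, that's the mix causing both compile errors.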