Tags: hadoop, mapper, sequencefile

How does the Mapper class identify a SequenceFile as the input file in Hadoop?


In one of my MapReduce jobs, I subclass BytesWritable as KeyBytesWritable and ByteWritable as ValueBytesWritable, and then write the output using SequenceFileOutputFormat.
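
For reference, a minimal sketch of what such a subclass might look like (the class name comes from the question; the body is assumed). Note that any custom Writable stored in a SequenceFile needs a public no-argument constructor, because the framework instantiates it reflectively:

import org.apache.hadoop.io.BytesWritable;

// Assumed shape: a thin subclass that inherits its serialization from BytesWritable.
public class KeyBytesWritable extends BytesWritable {
    public KeyBytesWritable() { super(); }            // no-arg constructor, required for reflective instantiation
    public KeyBytesWritable(byte[] bytes) { super(bytes); }
}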

My question is: when I start the next MapReduce job, I want to use this SequenceFile as the input file. How should I configure the job, and how can the Mapper class identify the key and value types that I overrode before?

I understand that I could use SequenceFile.Reader to read the keys and values:

Configuration config = new Configuration();
Path path = new Path(PATH_TO_YOUR_FILE);
SequenceFile.Reader reader = new SequenceFile.Reader(FileSystem.get(config), path, config);
WritableComparable key = (WritableComparable) reader.getKeyClass().newInstance();
Writable value = (Writable) reader.getValueClass().newInstance();
while (reader.next(key, value)) {
    // process key and value here
}
reader.close();

But I don't know how to use this Reader to pass the keys and values into the Mapper class as parameters. How can I set the job's input format to SequenceFileInputFormat and then let the Mapper receive the keys and values?

Thanks


Solution

  • You do not need to read the sequence file manually. Just set the job's input format class to SequenceFileInputFormat:

    job.setInputFormatClass(SequenceFileInputFormat.class);
    

    and set the input path to the directory containing your sequence files:

    FileInputFormat.setInputPaths(job, new Path("<path to the dir containing your sequence files>"));
    

    You will need to make sure the (key, value) type parameters of your Mapper class match the (key, value) types stored inside your sequence file; a full driver sketch follows below.
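
    For concreteness, here is a minimal driver and Mapper sketch for the follow-up job. It assumes Hadoop 2.x with the new mapreduce API; KeyBytesWritable and ValueBytesWritable are the question's own classes and are assumed to be on the classpath, and the output types and paths are illustrative placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class SecondJobDriver {

        // The Mapper's input types must match the (key, value) classes that the
        // first job wrote into the sequence file.
        public static class SequenceFileMapper
                extends Mapper<KeyBytesWritable, ValueBytesWritable, Text, Text> {

            @Override
            protected void map(KeyBytesWritable key, ValueBytesWritable value, Context context)
                    throws java.io.IOException, InterruptedException {
                // The framework has already deserialized key and value;
                // process them and emit whatever the second job needs.
                context.write(new Text(key.toString()), new Text(value.toString()));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "read sequence file");
            job.setJarByClass(SecondJobDriver.class);

            // Tell the framework the input is a sequence file; it reads the
            // key/value class names from the file header and instantiates them.
            job.setInputFormatClass(SequenceFileInputFormat.class);

            job.setMapperClass(SequenceFileMapper.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }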