I am reading multiple input files for a word count problem.
Example file names: file1.txt file2.txt file3.txt
I am able to get the word count, but what should I add if I also want the names of the files in which each word occurs, along with the count?
For example,
Contents of file 1: welcome to Hadoop
Contents of file 2: This is hadoop
Current output:
Hadoop 2
Is 1
This 1
To 1
Welcome 1
Expected output:
Hadoop 2 File01.txt File02.txt
Is 1 File02.txt
This 1 File02.txt
To 1 File01.txt
Welcome 1 File01.txt
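First, get the name of the current file from the input split in the mapper (with the old mapred API, the split is available from reporter.getInputSplit()):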
String file = ((FileSplit)inputSplit).getPath().getName();
Then collect the word as the key and the file name as the value in the mapper's output.
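As a rough, framework-free sketch of that map step: the class name MapStepSketch and the method map are hypothetical, and lower-casing the tokens is an assumption made so that "Hadoop" and "hadoop" end up under the same key. In a real mapper, each pair would be passed to output.collect instead of being added to a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for the mapper logic; it does not use the Hadoop API.
class MapStepSketch {
    // Returns one {word, fileName} pair per token in the line.
    // In a real mapper each pair would go to output.collect(word, fileName).
    static List<String[]> map(String fileName, String line) {
        List<String[]> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            // Assumption: lower-case so "Hadoop" and "hadoop" share one key.
            String word = tokens.nextToken().toLowerCase();
            pairs.add(new String[] {word, fileName});
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Prints each word and its file name, separated by a tab.
        for (String[] pair : map("File01.txt", "welcome to Hadoop")) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```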
In the reducer, iterate over the file names received for each key; for each one, increment the counter and append the file name:
file += " " + filename;
textString = counter + file;
output.collect(key, new Text(textString));
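The reduce step can be sketched without the Hadoop API as well. The class name ReduceStepSketch and the method reduce are hypothetical; the iterator stands in for the values that the framework passes to the reducer for one key, and the returned string is what would be wrapped in a Text and passed to output.collect.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for the reducer logic; it does not use the Hadoop API.
class ReduceStepSketch {
    // Builds "<count> <file> <file> ..." from the file names seen for one key.
    static String reduce(Iterator<String> fileNames) {
        int counter = 0;
        StringBuilder files = new StringBuilder();
        while (fileNames.hasNext()) {
            String fileName = fileNames.next();
            counter++;                            // one occurrence of the word
            files.append(' ').append(fileName);   // keep appending the file name
        }
        return counter + files.toString();
    }

    public static void main(String[] args) {
        Iterator<String> values = Arrays.asList("File01.txt", "File02.txt").iterator();
        // Key and value are separated by a tab, as in the default text output.
        System.out.println("hadoop\t" + reduce(values));
    }
}
```

With this approach a file name is appended once per occurrence, so a word that appears twice in the same file lists that file twice. Collecting the names in a LinkedHashSet before joining them would be one way to avoid that.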
This solved the problem.