I have some questions about MapReduce output part files.
Normally, part-r-* files come from the reducers. MultipleOutputs allows you to use a different naming convention. If there is no reduce step, the output will be part-m-*. As I understand it, if a reducer is defined, the mapper outputs are deleted regardless of whether the reducers produce anything. Usually the reducer output files are created even if they are empty, unless you use LazyOutputFormat. Where did you find part-* files that did not end with either m-nnnnn or r-nnnnn?
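
To make the naming convention above concrete, here is a minimal sketch that sorts output file names into map-side, reduce-side, or neither. It uses only the Java standard library and does not call Hadoop. The class name `PartFileNames`, the `classify` helper, and the sample file names are hypothetical and chosen for illustration. The pattern assumes at least five digits in the task number and allows an optional extension such as `.gz`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PartFileNames {
    // Assumed pattern: part-m-NNNNN or part-r-NNNNN, five or more digits,
    // optionally followed by an extension (for example ".gz").
    private static final Pattern PART =
        Pattern.compile("part-([mr])-(\\d{5,})(\\..+)?");

    // Returns "map", "reduce", or "other" for a bare file name (no directory).
    static String classify(String fileName) {
        Matcher m = PART.matcher(fileName);
        if (!m.matches()) {
            return "other";
        }
        return m.group(1).equals("m") ? "map" : "reduce";
    }

    public static void main(String[] args) {
        // Hypothetical sample names, not taken from any real job.
        String[] names = {"part-r-00000", "part-m-00003", "part-00000", "_SUCCESS"};
        for (String name : names) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```

Running a check like this over a job's output directory would show quickly which files fall outside the m-nnnnn / r-nnnnn convention.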