java hadoop2

How to remove -r-00000 from part-r-00000 in reducer output


In my MapReduce code, I use MultipleOutputs on the reducer side and have attached the input split to it. I want each output file to be named after my key value (keyvalue), not keyvalue-r-00000. How can I remove the trailing -r-00000 suffix? Here is my reducer-side code.

        String last = map.lastKey();
        String[] tab2 = last.split(",");
        String line1 = "[" + tab2[2] + "," + tab2[3] + "," + tab2[8] + "]" + "\n" + "];";
        text1.set(line1);
        // The third argument is the base output path, here derived from the key.
        multipleOutputs.write(NullWritable.get(), text1, generateFileName(key));
    }

    // Builds the output file name from the reducer key.
    String generateFileName(Text key) {
        return key.toString();
    }

    @Override
    public void setup(Context con){
        multipleOutputs = new MultipleOutputs<NullWritable, Text>(con);
    }

    @Override
    public void cleanup(final Context context) throws IOException, InterruptedException{
        multipleOutputs.close();
    }
}

Solution

  • As per the Javadoc of MultipleOutputs, the parameters of write are as follows:

    write(KEYOUT key, VALUEOUT value, String baseOutputPath)
    

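    Here the first parameter should be your key rather than NullWritable.get() (the MultipleOutputs type parameters in setup would presumably need to match the key type, for example Text), so the call becomes: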

    multipleOutputs.write(key, text1, generateFileName(key));
    

    This works for me.
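
If the -r-00000 suffix is still present after this change (as far as I know, MultipleOutputs itself appends the task type and partition number to baseOutputPath), one fallback is to rename the files once the job has finished. The following is only a minimal sketch of that idea and is not part of the original answer: SuffixStripper, stripTaskSuffix and renameAll are hypothetical names, and it uses java.nio on a local directory as a stand-in. On HDFS you would list the job output directory and call Hadoop's FileSystem.rename instead.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: strips the "-r-00000" style suffix from output file names.
class SuffixStripper {

    // Matches a trailing "-r-NNNNN" (reducer) or "-m-NNNNN" (mapper) task suffix.
    private static final Pattern TASK_SUFFIX = Pattern.compile("-[mr]-\\d{5,}$");

    // Returns the file name without the task suffix; other names are returned unchanged.
    static String stripTaskSuffix(String fileName) {
        return TASK_SUFFIX.matcher(fileName).replaceFirst("");
    }

    // Renames every file in outputDir whose name ends with a task suffix.
    // Assumption: each base name is written by one task only, so names do not collide.
    // If they do, Files.move throws FileAlreadyExistsException instead of overwriting.
    static void renameAll(Path outputDir) throws IOException {
        // Collect the candidates first so the directory is not modified while it is being listed.
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(outputDir)) {
            for (Path file : files) {
                candidates.add(file);
            }
        }
        for (Path file : candidates) {
            String name = file.getFileName().toString();
            String stripped = stripTaskSuffix(name);
            if (!stripped.equals(name)) {
                Files.move(file, file.resolveSibling(stripped));
            }
        }
    }
}
```

Calling SuffixStripper.renameAll on the job output directory after the job completes would turn keyvalue-r-00000 into keyvalue. Renaming after the job, rather than inside the reducer, avoids interfering with the output committer while tasks are still running.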