Search code examples
javamapreducehdfsnio

mapreduce using NIO on HDFS


I'm need to put some text into an HDFS file from (all) the Mappers of a map reduce process.

The text / file is be used as a lookup in the reducers process so it cannot travel in the regular path (context.write())

Using the below snippet is both slow and may produce file lock issues when activated from the different mappers.

I would love to use ByteBuffer and file locks (NIO). is this possible in this framework ? Also, Any other ideas is welcome.

The code snippet:

Path fname = ...
FileSystem fs = FileSystem.get(context.getconfiguration());    
out = fs.create(fname);
while (condition) out.write(...);
out.flush();
out.close();

Thanks for any idea / help.

Raz


Solution

  • I would love to use ByteBuffer and file locks (NIO). is this possible in this framework ?

    No, this is not possible to do on HDFS.

    Also, Any other ideas is welcome.

    Take a look at how MapReduce creates its output. There is a file for every mapper or reducer. You will have to write your intermediate file in this way.

    part-r-00001
    part-r-00002
    part-r-..n..
    

    The text / file is be used as a lookup in the reducers process so it cannot travel in the regular path (context.write())

    Keep in mind that this will probably be very inefficient in terms of network communication between nodes. Especially if you are writing a large lookup file.