Search code examples
javabufferniochannelmemory-mapped-files

Java NIO - Memory mapped files


I recently came across this article which provided a nice intro to memory mapped files and how it can be shared between two processes. Here is the code for a process that reads in the file:

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MemoryMapReader {

 /**
  * @param args
  * @throws IOException 
  * @throws FileNotFoundException 
  * @throws InterruptedException 
  */
 public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException {

  FileChannel fc = new RandomAccessFile(new File("c:/tmp/mapped.txt"), "rw").getChannel();

  long bufferSize=8*1000;
  MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_ONLY, 0, bufferSize);
  long oldSize=fc.size();

  long currentPos = 0;
  long xx=currentPos;

  long startTime = System.currentTimeMillis();
  long lastValue=-1;
  for(;;)
  {

   while(mem.hasRemaining())
   {
    lastValue=mem.getLong();
    currentPos +=8;
   }
   if(currentPos < oldSize)
   {

    xx = xx + mem.position();
    mem = fc.map(FileChannel.MapMode.READ_ONLY,xx, bufferSize);
    continue;   
   }
   else
   {
     long end = System.currentTimeMillis();
     long tot = end-startTime;
     System.out.println(String.format("Last Value Read %s , Time(ms) %s ",lastValue, tot));
     System.out.println("Waiting for message");
     while(true)
     {
      long newSize=fc.size();
      if(newSize>oldSize)
      {
       oldSize = newSize;
       xx = xx + mem.position();
       mem = fc.map(FileChannel.MapMode.READ_ONLY,xx , oldSize-xx);
       System.out.println("Got some data");
       break;
      }
     }   
   }

  }

 }

}

I have, however, a few comments/questions regarding that approach:

If we execute the reader only on an empty file, i.e run

  long bufferSize=8*1000;
  MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_ONLY, 0, bufferSize);
  long oldSize=fc.size();

This will allocate 8000 bytes which will now extend the file. The buffer that this returns has a limit of 8000 and a position of 0, therefore, the reader can proceed and read empty data. After this happens, the reader will stop, as currentPos == oldSize.

Supposedly now the writer comes in (code is omitted as most of it is straightforward and can be referenced from the website) - it uses the same buffer size, so it will write first 8000 bytes, then allocate another 8000, extending the file. Now, if we suppose this process pauses at this point, and we go back to the reader, then the reader sees the new size of the file and allocates the remainder (so from position 8000 until 1600) and starts reading again, reading in another garbage...

I am a bit confused whether there is a why to synchronize those two operations. As far as I see it, any call to map might extend the file with really an empty buffer (filled with zeros) or the writer might have just extended the file, but has not written anything into it yet...


Solution

  • There are several ways.

    1. Let the writer acquire an exclusive Lock on the region that has not been written yet. Release the lock when everything has been written. This is compatible to every other application running on that system but it requires the reader to be smart enough to retry on failed reads unless you combine it with one of the other methods

    2. Use another communication channel, e.g. a pipe or a socket or a file’s metadata channel to let the writer tell the reader about the finished write.

    3. Write at a position in the file a special marker (being part of the protocol) telling about the written data, e.g.

      MappedByteBuffer bb;
      …
      // write your data
      
      bb.force();// ensure completion of all writes
      bb.put(specialPosition, specialMarkerValue);
      bb.force();// ensure visibility of the marker