Why don't Java FileOutputStream's write() and flush() make the NFS client actually send data to the NFS server?


My Java web application uses an NFS file system. I use FileOutputStream to open a file, write multiple chunks, and then close it.

From the profiler stats, I found that stream.write(byte[] payload, int begin, int length) and even stream.flush() take zero milliseconds. Only the call to stream.close() takes a non-zero amount of time.

It seems that Java FileOutputStream's write() and flush() don't actually cause the NFS client to send data to the NFS server. Is there another Java class that will make the NFS client flush data in real time, or is there some NFS client tuning that needs to be done?


Solution

  • You are probably running into Unix client-side caching. The O'Reilly NFS book covers this in a lot of detail.

    But in short:

    Using the buffer cache and allowing async threads to cluster multiple buffers introduces some problems when several machines are reading from and writing to the same file. To prevent file inconsistency with multiple readers and writers of the same file, NFS institutes a flush-on-close policy: All partially filled NFS data buffers for a file are written to the NFS server when the file is closed.

    For NFS Version 3 clients, any writes that were done with the stable flag set to off are forced onto the server's stable storage via the commit operation.

    NFS uses an approach called close-to-open cache consistency: you have to close the file before the server (and other clients) get a consistent, up-to-date view of it. You are seeing the downside of this approach, which aims to minimize server hits.

    Avoiding the cache is hard from Java. On Linux you'd need to pass the O_DIRECT flag to open(); see this answer for more: https://stackoverflow.com/a/16259319/5851520. Basically, it disables the client's OS cache for that file, though not the server's.

    Unfortunately, the standard JDK doesn't expose O_DIRECT, as discussed here: Force JVM to do all IO without page cache (e.g. O_DIRECT). Essentially, use JNI yourself or use a good third-party library. I've heard good things about JNA: https://github.com/java-native-access/jna ...

    Alternatively, if you have control over the client mount point, you can use the sync mount option, as described in the NFS manual, which says:

    If the sync option is specified on a mount point, any system call that writes data to files on that mount point causes that data to be flushed to the server before the system call returns control to user space. This provides greater data cache coherence among clients, but at a significant performance cost.

    This could be what you're looking for.
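
If changing the mount options is not possible, two per-file approaches can be tried from plain Java. The sketch below is only an illustration: the class and method names are hypothetical, and the NFS behaviour is an assumption you should verify on your own mount. It relies on two standard JDK calls: FileDescriptor.sync(), which asks the OS to push buffered data for that file to the underlying storage, and StandardOpenOption.SYNC, which requests that every write be performed synchronously. On an NFS mount both are expected to make the client send the data to the server instead of waiting for close(), at a noticeable performance cost. Note that FileOutputStream.flush() itself is inherited from OutputStream and does nothing, which matches the zero-millisecond timings in the question.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; names are illustrative and not taken from the question.
final class NfsWriteSketch {

    // Option 1: keep FileOutputStream, but call sync() on its descriptor after each chunk.
    static long writeWithFsync(File target, byte[][] chunks) throws IOException {
        long written = 0;
        try (FileOutputStream out = new FileOutputStream(target)) {
            for (byte[] chunk : chunks) {
                out.write(chunk, 0, chunk.length); // data lands in the client's OS cache
                out.getFD().sync();                // blocks until the OS reports the data as stored
                written += chunk.length;
            }
        }
        return written;
    }

    // Option 2: open the file with SYNC so that each write is synchronous.
    // Assumption: on an NFS mount this behaves like the sync mount option, for this file only.
    static long writeWithSyncOption(Path target, byte[][] chunks) throws IOException {
        long written = 0;
        try (OutputStream out = Files.newOutputStream(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE,
                StandardOpenOption.SYNC)) {
            for (byte[] chunk : chunks) {
                out.write(chunk, 0, chunk.length);
                written += chunk.length;
            }
        }
        return written;
    }
}
```

With either option, the profiler should show the time moving from close() into the individual write() or sync() calls. Syncing after every chunk is the most expensive choice; calling sync() only at points where other clients must see the data is a reasonable compromise.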