FileInputStream uses this native method to read bytes into a user-supplied buffer. As we can see in the implementation, a user-space buffer is indeed allocated with length max(BUF_SIZE, supplied buffer length). This buffer is then passed to IO_Read. I could not pull up the implementation of this method, but I am pretty sure this is the standard C buffered reading. Why is FileInputStream then said to be unbuffered? And what additional buffering is BufferedReader using?
Edit: Benchmark comparing read times of a 100M file and its results:
Benchmark Mode Cnt Score Error Units
StreamIOBenchmark.bisMultiBytes avgt 30 27.640 ± 1.117 ms/op
StreamIOBenchmark.bisSingleBytes avgt 30 400.552 ± 26.921 ms/op
StreamIOBenchmark.fisMultiBytes avgt 30 25.231 ± 1.459 ms/op
StreamIOBenchmark.fisSingleBytes avgt 30 97991.213 ± 5620.685 ms/op
The real difference shows up when reading single bytes, as FIS makes a new system call for each byte. However, when reading 8k bytes at once, FIS (fisMultiBytes) and BIS (bisMultiBytes) perform about the same.
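To see why the single-byte case differs so much, one can count how many read requests actually reach the underlying source. The sketch below is not the benchmark above: ReadCallDemo and CountingInputStream are hypothetical names, and a ByteArrayInputStream stands in for the file, so it counts Java-level read requests rather than real system calls.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; an in-memory stream stands in for the file.
final class ReadCallDemo {

    /** Counts every read request that reaches the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        long calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads one byte at a time until end of stream. */
    private static void drainSingleBytes(InputStream in) throws IOException {
        while (in.read() != -1) {
            // discard the byte; only the number of requests matters here
        }
    }

    /** Requests reaching the source when single-byte reads hit it directly. */
    static long directCalls(int size) {
        CountingInputStream src =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try {
            drainSingleBytes(src);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return src.calls;
    }

    /** Requests reaching the source when a BufferedInputStream sits in front. */
    static long bufferedCalls(int size) {
        CountingInputStream src =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try (InputStream in = new BufferedInputStream(src)) {
            drainSingleBytes(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return src.calls;
    }

    public static void main(String[] args) {
        int size = 100_000;
        System.out.println("direct requests:   " + directCalls(size));
        System.out.println("buffered requests: " + bufferedCalls(size));
    }
}
```

With an unbuffered source every read() is forwarded as-is, so the request count grows with the number of bytes. With the buffered wrapper it grows with the number of buffer refills, which is the same effect the fisSingleBytes and bisSingleBytes rows show.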
Is FileInputStream really unbuffered?
Yes. Really.
I could not pull up the implementation of this method but I am pretty sure this is the standard C buffered reading.
No, it isn't. The native code is NOT using C buffered I/O. It performs a read(2) call into a temporary buffer, then copies the data from the temporary buffer to the caller-supplied byte[].
In the Java 11u code I am looking at, the temporary buffer is either a small on-stack buffer or a larger native heap buffer that is malloc'd and free'd during the native call.
The relative pathname in the OpenJDK Java 11u codebase is ./src/java.base/share/native/libjava/io_util.c. Look for the readBytes function. (It is called via a native entry point in ./src/java.base/share/native/libjava/FileInputStream.c.)
The IO_Read in io_util.c is a macro that aliases handleRead, which wraps the read(2) call.
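The flow described above can be paraphrased in Java to show why a temporary buffer is not the same thing as buffering. This is only a rough model under stated assumptions: the real code is C, ReadBytesModel is a hypothetical name, a plain array stands in for the stack or malloc'd buffer, and an InputStream stands in for the file descriptor.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical Java paraphrase of the native flow; NOT the actual OpenJDK code.
final class ReadBytesModel {

    /**
     * Models one native read: issue a single read request into a scratch
     * buffer, copy the result into the caller's array, then drop the scratch.
     */
    static int readBytes(InputStream rawSource, byte[] dst, int off, int len) {
        if (len == 0) {
            return 0;
        }
        byte[] scratch = new byte[len];              // stands in for the stack/heap temp buffer
        int n;
        try {
            n = rawSource.read(scratch, 0, len);     // stands in for the single read(2) call
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (n > 0) {
            System.arraycopy(scratch, 0, dst, off, n); // copy to the caller-supplied byte[]
        }
        return n;                                    // scratch is discarded; nothing is kept for later
    }

    public static void main(String[] args) {
        InputStream src = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        byte[] dst = new byte[4];
        System.out.println(readBytes(src, dst, 0, 4));
        System.out.println(readBytes(src, dst, 0, 4));
    }
}
```

Every call asks the source for exactly what the caller requested and keeps nothing between calls, so a one-byte request still costs one trip to the source.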
UPDATE
I just noticed that you had already found a link to the Java 8u version of io_util.c. It behaves the same way as the 11u code that I described above. So maybe you just misread the C code you found?
Why is FileInputStream then said to be unbuffered?
It is "said to be" unbuffered because it really is unbuffered.
And what additional buffering is BufferedReader using?
You can see that by looking at the (Java) source code for BufferedReader.
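For a rough idea of what that kind of buffering looks like, here is a deliberately simplified sketch. It is not the JDK source: TinyBufferedInput and its fields are hypothetical, and it works on bytes, whereas BufferedReader keeps a char[] filled from an underlying Reader. The point is only that the buffer lives in the object and survives between calls.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical, simplified illustration; NOT the JDK's buffering code.
final class TinyBufferedInput {
    private final InputStream source;
    private final byte[] buf;   // persists across calls, unlike a per-call temp buffer
    private int pos = 0;        // index of the next byte to hand out
    private int limit = 0;      // number of valid bytes currently in buf
    int refills = 0;            // how many times the source was asked for data

    TinyBufferedInput(InputStream source, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.source = source;
        this.buf = new byte[bufferSize];
    }

    /** Returns the next byte as 0..255, or -1 at end of stream. */
    int read() throws IOException {
        if (pos >= limit) {
            // Buffer exhausted: one large request to the source.
            refills++;
            int n = source.read(buf, 0, buf.length);
            if (n <= 0) {
                return -1;
            }
            limit = n;
            pos = 0;
        }
        // Served from memory; the source is not touched.
        return buf[pos++] & 0xFF;
    }

    /** Drains the data one byte at a time and reports how many refills it took. */
    static int countRefills(byte[] data, int bufferSize) {
        TinyBufferedInput in =
                new TinyBufferedInput(new ByteArrayInputStream(data), bufferSize);
        try {
            while (in.read() != -1) {
                // discard
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return in.refills;
    }

    public static void main(String[] args) {
        System.out.println(countRefills(new byte[10], 4));
    }
}
```

Most single-byte reads here are answered from buf without touching the source, which is the behavior FileInputStream on its own does not provide.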
Hint: rather than making (incorrect) assumptions about how the JVM native code implementation works, download and read the OpenJDK source code. It is available for anyone with enough disk space (1.2GB) to download.