To start with, I understand the concept of buffering as a wrapper around, for instance, FileInputStream, acting as a temporary container for contents read (let's take the read scenario) from an underlying stream, in this case FileInputStream.
[...] read method of BufferedInputStream) has to make 100 reads (one byte at a time). As FileInputStream is the ultimate source of the data (a file which contains 100 bytes), even though it is wrapped by BufferedInputStream, wouldn't it have to read 100 times to read 100 bytes? The code calls the read method of BufferedInputStream, but the call is passed on to the read method of FileInputStream, which needs to make 100 read calls. This is the point I'm unable to comprehend. In other words, even when wrapped by a BufferedInputStream, the underlying stream (such as FileInputStream) still has to read one byte at a time. So where is the benefit of buffering? I mean the benefit to the application's performance, not to the code, which needs only two read calls against the buffer.
Thanks.
EDIT:
I'm posting this as a follow-up edit rather than a comment because I think it fits better here contextually, and it serves as a TL;DR for readers of the chat between @Kayaman and me.
The read method of BufferedInputStream says (excerpt):
As an additional convenience, it attempts to read as many bytes as possible by repeatedly invoking the read method of the underlying stream. This iterated read continues until one of the following conditions becomes true:
- The specified number of bytes have been read,
- The read method of the underlying stream returns -1, indicating end-of-file, or
- The available method of the underlying stream returns zero, indicating that further input requests would block.
I dug into the code and found the following method call trace:
BufferedInputStream -> read(byte b[]) (as I want to see buffering in action)
BufferedInputStream -> read(byte b[], int off, int len)
BufferedInputStream -> read1(byte[] b, int off, int len) - private
FileInputStream -> read(byte b[], int off, int len)
FileInputStream -> readBytes(byte b[], int off, int len) - private and native. Method description from the source code: "Reads a subarray as a sequence of bytes."
The call to read1 (#4, above mentioned) in BufferedInputStream sits inside an infinite for loop. It returns on the conditions listed in the above excerpt of the read method description.
As I mentioned in the OP (#6), the call does seem to be handled by the underlying stream, which matches both the API method description and the method call trace.
The question still remains: does the native API call, readBytes of FileInputStream, read one byte at a time and build an array of those bytes to return?
"The underlying streams (such as FileInputStream) still have to read one byte at a time"
Luckily no; that would be hugely inefficient. Buffering allows the BufferedInputStream to make read(byte[8192] buffer) calls to the FileInputStream, each of which returns a whole chunk of data.
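To illustrate the chunked read, here is a minimal sketch (not taken from the JDK). It assumes a local temporary file; the class name ChunkReadDemo and the method bytesFromOneReadCall are hypothetical names made up for this example. It writes a small file and reports how many bytes a single FileInputStream.read(byte[]) call hands back.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, for illustration only.
class ChunkReadDemo {

    // Writes fileSize zero bytes to a temp file, then issues ONE read(byte[]) call
    // on a FileInputStream and returns how many bytes that single call delivered.
    static int bytesFromOneReadCall(int fileSize) {
        try {
            Path tmp = Files.createTempFile("chunk-demo", ".bin");
            try {
                Files.write(tmp, new byte[fileSize]);
                try (FileInputStream fis = new FileInputStream(tmp.toFile())) {
                    byte[] chunk = new byte[8192]; // same size as the read mentioned above
                    return fis.read(chunk);        // one call, many bytes
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes returned by a single read call: " + bytesFromOneReadCall(100));
    }
}
```

For a small regular file on a local disk, one call normally returns the whole content, although the API only promises "up to" the array length.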
If you then want to read a single byte (or not), it will be returned efficiently from the BufferedInputStream's internal buffer instead of having to go down to the file level. So the BufferedInputStream is there to reduce the number of times we do actual reads from the filesystem, and when those reads are done, they're done in an efficient fashion even if the end user wanted to read just a few bytes.
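One way to see this for yourself is to count how often the wrapped stream is actually asked for data. The sketch below is only an illustration: CountingInputStream and BufferingDemo are hypothetical helpers, and a ByteArrayInputStream stands in for the file so the example stays self-contained. It reads 100 bytes one at a time through a BufferedInputStream with a 50-byte buffer.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: records how the wrapped stream is called.
class CountingInputStream extends FilterInputStream {
    int singleByteReads = 0; // calls to read()
    int bulkReads = 0;       // calls to read(byte[], int, int)

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        singleByteReads++;
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        bulkReads++;
        return super.read(b, off, len);
    }
}

class BufferingDemo {

    // Reads `size` bytes one by one through a BufferedInputStream and returns
    // { single-byte reads, bulk reads } as seen by the underlying source.
    static int[] countSourceReads(int size, int bufferSize) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        try (InputStream in = new BufferedInputStream(source, bufferSize)) {
            while (in.read() != -1) {
                // the caller consumes one byte per call
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { source.singleByteReads, source.bulkReads };
    }

    public static void main(String[] args) {
        int[] counts = countSourceReads(100, 50);
        System.out.println("single-byte reads on source: " + counts[0]);
        System.out.println("bulk reads on source: " + counts[1]);
    }
}
```

The caller makes about a hundred read() calls, but the source should only see a handful of bulk reads (one per buffer refill, plus the one that detects end of stream) and no single-byte reads at all.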
It's quite clear from the code that BufferedInputStream.read() does not delegate directly to UnderlyingStream.read(), as that would bypass all the buffering.
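public synchronized int read() throws IOException {
    if (pos >= count) {          // internal buffer is exhausted
        fill();                  // refill it from the underlying stream
        if (pos >= count)
            return -1;           // still nothing available: end of stream
    }
    return getBufIfOpen()[pos++] & 0xff;   // serve the next byte from the buffer
}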
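The same pos/count/fill() idea can be reproduced in a few lines. What follows is a stripped-down imitation for illustration, not the JDK implementation: TinyBufferedInput and drain are hypothetical names, and mark/reset, synchronization and closing are left out on purpose.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical, simplified imitation of the buffering logic shown above.
class TinyBufferedInput {
    private final InputStream in;
    private final byte[] buf;
    private int pos = 0;    // index of the next byte to hand out
    private int count = 0;  // number of valid bytes currently in buf
    int fills = 0;          // how many times the underlying stream was consulted

    TinyBufferedInput(InputStream in, int bufferSize) {
        this.in = in;
        this.buf = new byte[bufferSize];
    }

    // One bulk read from the underlying stream refills the whole buffer.
    private void fill() throws IOException {
        fills++;
        pos = 0;
        count = 0;
        int n = in.read(buf, 0, buf.length);
        if (n > 0) {
            count = n;
        }
    }

    // Same shape as the JDK method above: touch the source only when the buffer is empty.
    int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count) {
                return -1;
            }
        }
        return buf[pos++] & 0xff;
    }

    // Reads everything byte by byte; returns { bytes read, fill() calls }.
    static int[] drain(byte[] data, int bufferSize) {
        TinyBufferedInput t = new TinyBufferedInput(new ByteArrayInputStream(data), bufferSize);
        int bytes = 0;
        try {
            while (t.read() != -1) {
                bytes++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { bytes, t.fills };
    }

    public static void main(String[] args) {
        int[] result = drain(new byte[100], 50);
        System.out.println("bytes read: " + result[0] + ", fills: " + result[1]);
    }
}
```

With 100 bytes and a 50-byte buffer, the caller still performs one read() per byte, while the underlying stream is consulted only once per refill.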