Here are two pieces of code:
FileInputStream is = new FileInputStream(tmp);
byte[] buf = new byte[1024];
while (is.read(buf) > -1) {
}
and
BufferedInputStream is = new BufferedInputStream(new FileInputStream(tmp), 1024);
while (is.read() > -1) {
}
From the BufferedInputStream source code it seems they should take about the same time, but in fact the first version runs much faster (166 ms vs 5159 ms on a 200 MB file). Why?
FileInputStream#read(byte b[]) reads multiple bytes into b on each call, in this case up to 1024. BufferedInputStream#read() returns a single byte per call. Internally, BufferedInputStream does use a 1024-byte buffer to copy data from the stream it wraps; however, you are still making far more method calls than you have to. For a 200 MB file, that is roughly 200 million calls to read() instead of about 200 thousand calls to read(byte[]).
Try using the BufferedInputStream#read(byte b[]) method instead and you will see speeds comparable to FileInputStream, as in the sketch below.
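A minimal sketch of that change (tmp is the same File as in the question; try-with-resources is added here so the stream is closed):

try (BufferedInputStream is = new BufferedInputStream(new FileInputStream(tmp), 1024)) {
    byte[] buf = new byte[1024];
    // One method call per chunk of up to 1024 bytes, instead of one per byte.
    while (is.read(buf) > -1) {
    }
}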
Also, as noted by OldCurmudgeon, the BufferedInputStream#read method is synchronized:
public synchronized int read() throws IOException {
    if (pos >= count) {
        fill();
        if (pos >= count)
            return -1;
    }
    return getBufIfOpen()[pos++] & 0xff;
}
The bulk read(byte b[], int off, int len) method is synchronized as well, but it takes the lock once per chunk rather than once per byte, so the locking cost is amortized. To show how much overhead per-call synchronization can add, I made a small demo:
public class Main {

    static final int TEST_SIZE = 100_000_000;
    static final double BILLION = 1_000_000_000.0;

    public static void main(String[] args) {
        testStandard();
        testSync();
    }

    // Empty loop: measures the bare cost of the iterations themselves.
    static void testStandard() {
        long startTime = System.nanoTime();
        for (int i = 0; i < TEST_SIZE; i++) {
        }
        long endTime = System.nanoTime();
        System.out.println((endTime - startTime) / BILLION + " seconds");
    }

    // Same loop, but each iteration enters and exits a monitor.
    static void testSync() {
        long startTime = System.nanoTime();
        for (int i = 0; i < TEST_SIZE; i++) {
            synchronized (Main.class) {
            }
        }
        long endTime = System.nanoTime();
        System.out.println((endTime - startTime) / BILLION + " seconds");
    }
}
On my computer, the synchronized loop took around 40 times longer to execute:
0.13086644 seconds
4.90248797 seconds
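If you want to reproduce the end-to-end comparison from the question, here is a minimal sketch under a couple of assumptions: the file path comes from the command line, and the IoRunnable interface is hypothetical glue for timing code that throws IOException, not part of any library.

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadBenchmark {

    // Hypothetical helper type: a Runnable that may throw IOException.
    interface IoRunnable {
        void run() throws IOException;
    }

    public static void main(String[] args) throws IOException {
        String path = args[0]; // assumed: path to a large test file

        // Few calls, no buffering layer.
        time("FileInputStream#read(byte[])", () -> {
            try (InputStream in = new FileInputStream(path)) {
                byte[] buf = new byte[1024];
                while (in.read(buf) > -1) {
                }
            }
        });

        // One synchronized call per byte.
        time("BufferedInputStream#read()", () -> {
            try (InputStream in = new BufferedInputStream(new FileInputStream(path), 1024)) {
                while (in.read() > -1) {
                }
            }
        });

        // Buffering plus few calls: the suggested fix.
        time("BufferedInputStream#read(byte[])", () -> {
            try (InputStream in = new BufferedInputStream(new FileInputStream(path), 1024)) {
                byte[] buf = new byte[1024];
                while (in.read(buf) > -1) {
                }
            }
        });
    }

    static void time(String label, IoRunnable task) throws IOException {
        long start = System.nanoTime();
        task.run();
        System.out.println(label + ": " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}

As with any micro-benchmark, run each case a few times so the JIT and the OS file cache have warmed up before trusting the numbers.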