
Why is FileInputStream much faster than BufferedInputStream with the same buffer size?


Consider these two pieces of code:

FileInputStream is = new FileInputStream(tmp);
byte[] buf = new byte[1024];
while (is.read(buf) > -1) {
}

and

BufferedInputStream is = new BufferedInputStream(new FileInputStream(tmp),1024);
while (is.read() > -1) {
}

Judging from the BufferedInputStream source code, the two should take about the same time, but the first version actually runs much faster (166 ms vs 5159 ms on a 200M file). Why?


Solution

  • FileInputStream#read(byte b[]) reads multiple bytes into b on every call: up to 1024 in this case.

    BufferedInputStream#read() returns one byte per call. Internally, BufferedInputStream uses a buffer of size 1024 to copy data from the stream it wraps, so the file itself is still read in chunks. However, your loop makes roughly 1024 times as many method calls as the first version, which is far more work than necessary.
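    To make the difference in call counts concrete, here is a minimal sketch. CountingInputStream and ReadCallCount are hypothetical helpers written for this illustration (they are not JDK classes), and an in-memory ByteArrayInputStream stands in for the file. It counts how many read() calls the loop makes versus how many reads actually reach the wrapped stream:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of the JDK): counts reads that reach the wrapped stream.
class CountingInputStream extends FilterInputStream {
    long calls = 0;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        calls++;
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        calls++;
        return super.read(b, off, len);
    }
}

class ReadCallCount {
    // Returns {read() calls made by the loop, reads that reached the wrapped stream}.
    static long[] countCalls(int dataSize, int bufferSize) throws IOException {
        CountingInputStream counting =
                new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        long callerCalls = 0;
        try (BufferedInputStream in = new BufferedInputStream(counting, bufferSize)) {
            int b;
            do {
                b = in.read();   // one (synchronized) method call per byte
                callerCalls++;
            } while (b > -1);
        }
        return new long[] {callerCalls, counting.calls};
    }

    public static void main(String[] args) throws IOException {
        long[] r = countCalls(1 << 20, 1024);
        System.out.println("read() calls made by the loop: " + r[0]);
        System.out.println("reads on the wrapped stream:   " + r[1]);
    }
}
```

    The second number should be on the order of the data size divided by the buffer size, while the first grows with every single byte. So the buffer does its job; the cost is in the per-byte calls on top of it.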

    Try the BufferedInputStream#read(byte b[]) method instead and you should see speeds comparable to FileInputStream.
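    A rough way to try this yourself is sketched below. The class and method names (BulkReadSketch, drainPerByte, drainBulk) are made up for this example, and a 16 MiB temporary file is assumed as a stand-in for the 200M file from the question. Treat the timings as a rough indication only, since this is not a rigorous benchmark:

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BulkReadSketch {
    // Drains the stream one byte per call, as in the question's second snippet.
    static long drainPerByte(InputStream in) throws IOException {
        long total = 0;
        while (in.read() > -1) {
            total++;
        }
        return total;
    }

    // Drains the stream with read(byte[]), which returns the number of bytes copied.
    static long drainBulk(InputStream in, int chunkSize) throws IOException {
        byte[] buf = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) > -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a 16 MiB temporary file of zeros is enough to show the gap.
        Path tmp = Files.createTempFile("bulk-read", ".bin");
        try {
            Files.write(tmp, new byte[16 * 1024 * 1024]);

            long t0 = System.nanoTime();
            try (InputStream in = new BufferedInputStream(new FileInputStream(tmp.toFile()), 1024)) {
                drainPerByte(in);
            }
            long t1 = System.nanoTime();
            try (InputStream in = new BufferedInputStream(new FileInputStream(tmp.toFile()), 1024)) {
                drainBulk(in, 1024);
            }
            long t2 = System.nanoTime();

            System.out.println("per-byte read(): " + (t1 - t0) / 1_000_000 + " ms");
            System.out.println("bulk read(buf):  " + (t2 - t1) / 1_000_000 + " ms");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

    Both methods read exactly the same bytes; the only difference is how many calls it takes to get them.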

    Also, as noted by OldCurmudgeon, the BufferedInputStream#read method is synchronized:

    public synchronized int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count)
                return -1;
        }
        return getBufIfOpen()[pos++] & 0xff;
    }
    

    To show how much overhead this can add, I made a small demo:

    public class Main {
        // Number of loop iterations per test.
        static final int TEST_SIZE = 100_000_000;
        // Nanoseconds per second, used to convert the timings.
        static final double BILLION = 1_000_000_000.0;
    
        public static void main(String[] args) {
            testStandard();
            testSync();
        }
    
        // Baseline: an empty loop with no locking.
        static void testStandard() {
            long startTime = System.nanoTime();
            for (int i = 0; i < TEST_SIZE; i++) {
            }
            long endTime = System.nanoTime();
            System.out.println((endTime - startTime) / BILLION + " seconds");
        }
    
        // Same loop, but acquiring and releasing a lock on every iteration.
        static void testSync() {
            long startTime = System.nanoTime();
            for (int i = 0; i < TEST_SIZE; i++) {
                synchronized (Main.class) {}
            }
            long endTime = System.nanoTime();
            System.out.println((endTime - startTime) / BILLION + " seconds");
        }
    }
    

    On my computer, the synchronized version took around 40 times longer to execute:

    0.13086644 seconds
    4.90248797 seconds