
How can I read a Java InputStream efficiently from Scala (using the Java library)?


My Scala server gets an InputStream object from a socket via socket.getInputStream (my socket client sends some bytes; their size is printed below).

The following code tries to read the stream into an Array:

  var buffer: Array[Byte] = null
  def read(stream: InputStream, size: Int) = {
    val start = System.nanoTime()
    buffer = new Array[Byte](size)
    var value: Int = 0
    (0 until size).foreach(i => {
      value = stream.read()
      buffer(i) = value.toByte
    })
    val end = System.nanoTime()
    println(s"Getting buffer from InputStream, size: $size, cost: ${(end - start)/1e6} ms")
    buffer
  }

Part of the output is:

Getting buffer from InputStream, size: 4, cost: 174.923596 ms
Getting buffer from InputStream, size: 2408728, cost: 919.207885 ms

However, for the same data size, some existing servers are much faster; e.g. Redis can send the bytes in ~10 ms. So:

Is it possible to improve the performance of this program?


Solution

  • stream.read() is the slowest take on the concept.

    Instead you want the read(byte[]) variant, or the read(byte[], int offset, int length) variant (the former is just a very simple, and performance-wise essentially free, wrapper around the 3-parameter method).

    The 'overhead' of using read() ranges from 'slight' (if buffers are involved) to 'a factor of 1000' if there aren't any. In the second case, you can get back to the 'slight' overhead by wrapping your InputStream in a BufferedInputStream and reading from that.

    But no matter what happens, this:

    int toRead = 1000;
    byte[] data = new byte[toRead];
    int readSoFar = 0;
    while (readSoFar < toRead) {
      int read = in.read(data, readSoFar, toRead - readSoFar);
      if (read == -1) throw new IOException("Expected more data");
      readSoFar += read; // advance the offset; toRead stays fixed
    }
    

    is far faster than:

    int toRead = 1000;
    byte[] data = new byte[toRead];
    for (int i = 0; i < toRead; i++) {
      data[i] = (byte) in.read(); // one call per byte; read() returns an int
    }
    

    Using Scala makes no difference to performance for these examples.
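
For reference, here is a minimal sketch of the bulk-read loop packaged as a reusable helper. The class name ReadFullyDemo and the method name readFully are hypothetical, chosen only for illustration, and a ByteArrayInputStream stands in for the socket stream so that the example runs without a network connection. It is one way to write the loop, not the only one.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper class; the name and layout are illustrative only.
class ReadFullyDemo {

    // Reads exactly `size` bytes from `in`, or throws if the stream ends early.
    static byte[] readFully(InputStream in, int size) throws IOException {
        byte[] data = new byte[size];
        int readSoFar = 0;
        while (readSoFar < size) {
            // A single call may return fewer bytes than requested, so keep looping.
            int n = in.read(data, readSoFar, size - readSoFar);
            if (n == -1) {
                throw new EOFException("Expected " + size + " bytes, got " + readSoFar);
            }
            readSoFar += n;
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream(): an in-memory stream of known content.
        byte[] payload = new byte[2408728];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i;
        }
        byte[] result = readFully(new ByteArrayInputStream(payload), payload.length);
        System.out.println(Arrays.equals(payload, result)); // prints "true"
    }
}
```

Since Scala calls Java methods directly, the read function from the question could delegate to a helper like this with the socket's InputStream. The JDK's DataInputStream.readFully(byte[]) performs the same kind of loop, if wrapping the stream is acceptable.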