Tags: java, nio, bufferedreader, bytebuffer

BufferedReader for large ByteBuffer?


Is there a way to read a ByteBuffer with a BufferedReader without having to turn it into a String first? I want to read through a fairly large ByteBuffer as lines of text and for performance reasons I want to avoid writing it to the disk. Calling toString on the ByteBuffer doesn't work because the resulting String is too large (it throws java.lang.OutOfMemoryError: Java heap space). I would have thought there would be something in the API to wrap a ByteBuffer in a suitable reader, but I can't seem to find anything suitable.

Here's an abbreviated code sample that illustrates what I am doing:

// input stream is from Process getInputStream()
public String read(InputStream istream) throws IOException
{
  ReadableByteChannel source = Channels.newChannel(istream);
  ByteArrayOutputStream ostream = new ByteArrayOutputStream(bufferSize);
  WritableByteChannel destination = Channels.newChannel(ostream);
  ByteBuffer buffer = ByteBuffer.allocateDirect(writeBufferSize);

  while (source.read(buffer) != -1)
  {
    buffer.flip();
    while (buffer.hasRemaining())
    {
      destination.write(buffer);
    }
    buffer.clear();
  }

  // this data can be up to 150 MB, so it won't fit in a String.
  String result = ostream.toString();
  source.close();
  destination.close();
  return result;
}

// after the process is run, we call this method with the String
public void readLines(String text) throws IOException
{
  BufferedReader reader = new BufferedReader(new StringReader(text));
  String line;

  while ((line = reader.readLine()) != null)
  {
    // do stuff with line
  }
}

Solution

  • It's not clear why you're using a byte buffer to start with. If you've got an InputStream and you want to read lines from it, why don't you just use an InputStreamReader wrapped in a BufferedReader? What's the benefit in getting NIO involved?

    Calling toString() on a ByteArrayOutputStream sounds like a bad idea to me even if you had the space for it. If you really have to have a ByteArrayOutputStream, it's better to get the data as a byte array, wrap it in a ByteArrayInputStream, and wrap that in an InputStreamReader. If you really want to call toString(), at least use the overload which takes the name of the character encoding to use. Otherwise it'll use the platform default charset, which probably isn't what you want.

    EDIT: Okay, so you really want to use NIO. You're still writing to a ByteArrayOutputStream eventually, so you'll end up with a BAOS with the data in it. If you want to avoid making a copy of that data, you'll need to derive from ByteArrayOutputStream, for instance like this:

    public class ReadableByteArrayOutputStream extends ByteArrayOutputStream
    {
        /**
         * Converts the data in the current stream into a ByteArrayInputStream.
         * The resulting stream wraps the existing byte array directly;
         * further writes to this output stream will result in unpredictable
         * behavior.
         */
        public InputStream toInputStream()
        {
            // buf and count are the protected fields inherited from ByteArrayOutputStream
            return new ByteArrayInputStream(buf, 0, count);
        }
    }
    

    Then you can create the input stream, wrap it in an InputStreamReader, wrap that in a BufferedReader, and you're away.
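
To make both suggestions above concrete, here is a minimal sketch rather than a definitive implementation. The class name LineReadingSketch, the helper countLines, and the choice of UTF-8 are assumptions made for illustration; use whatever encoding your process actually emits. The first option reads lines straight from the InputStream. The second goes through the ByteArrayOutputStream subclass and reads from its internal buffer without copying it.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LineReadingSketch {

    /** Exposes the internal buffer as an InputStream without copying it. */
    static class ReadableByteArrayOutputStream extends ByteArrayOutputStream {
        synchronized InputStream toInputStream() {
            // buf and count are protected fields of ByteArrayOutputStream
            return new ByteArrayInputStream(buf, 0, count);
        }
    }

    /**
     * Reads the stream line by line and returns the number of lines seen.
     * Only one line is held in memory at a time.
     */
    static int countLines(InputStream in, Charset charset) throws IOException {
        int lines = 0;
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            while (reader.readLine() != null) {
                lines++; // "do stuff with line" would go here
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the process output (hypothetical sample data)
        byte[] sample = "first\nsecond\nthird\n".getBytes(StandardCharsets.UTF_8);

        // Option 1: wrap the InputStream directly, no NIO and no intermediate buffer
        InputStream direct = new ByteArrayInputStream(sample);
        System.out.println(countLines(direct, StandardCharsets.UTF_8));

        // Option 2: data already collected in the BAOS subclass, read back in place
        ReadableByteArrayOutputStream baos = new ReadableByteArrayOutputStream();
        baos.write(sample);
        System.out.println(countLines(baos.toInputStream(), StandardCharsets.UTF_8));
    }
}
```

With a real process you would pass process.getInputStream() to countLines in place of the sample stream. Option 1 never holds more than one line at a time, which is why it avoids the OutOfMemoryError. Option 2 still keeps the full byte array in memory, but it skips the extra copy that toString() or toByteArray() would make.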