java, arrays, bytearrayinputstream

Efficient ByteArrayInputStream manipulation


I am working with a ByteArrayInputStream that contains an XML document consisting of one element whose content is a large Base64-encoded string. I need to remove the surrounding tags so I can decode the text and output it as a PDF document.

What is the most efficient way to do this?

My knee-jerk reaction is to read the stream into a byte array, find the end of the start tag, find the beginning of the end tag, and then copy the middle part into another byte array. This seems rather inefficient, though, and the text I am working with can be large at times (128KB). I would like a way to do this without the extra byte arrays.


Solution

  • Do your search and conversion while you are reading the stream.

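    // find the start tag
    byte[] startTag = new byte[]{'<', 't', 'a', 'g', '>'};
    int fnd = 0; // number of start-tag bytes matched so far
    int tmp = 0;
    while((tmp = stream.read()) != -1) {
     if(tmp == startTag[fnd])
      fnd++;
     else
      fnd = (tmp == startTag[0]) ? 1 : 0; // the mismatched byte may itself begin the tag
     if(fnd == startTag.length) break;
    }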
    
    // get base64 bytes, four encoded characters at a time
    while(true) {
     int a = stream.read();
     int b = stream.read();
     int c = stream.read();
     int d = stream.read();
     byte o1,o2,o3; // output bytes
     if(a == -1 || a == '<') break; // end of stream or start of the end tag
     // decode a, b, c, d into o1, o2, o3 here
     ...
     outputStream.write(o1);
     outputStream.write(o2);
     outputStream.write(o3);
    }
    

    Note: the above was written in my web browser, so syntax errors may exist.
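
The decoding step elided above can also be delegated to the JDK's `java.util.Base64` (Java 8+), which is a swapped-in technique rather than the hand-written decoder the answer outlines. The following is a minimal sketch under a few assumptions: the class name `Base64ElementExtractor`, the helper `UntilTagStream`, and the `extract` method are hypothetical names introduced for illustration; the element content is assumed to contain only Base64 characters and optional ASCII whitespace; and the first `<` after the start tag is assumed to begin the end tag.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;

// Hypothetical helper; the names here are illustrative, not from the original answer.
final class Base64ElementExtractor {

    // Exposes the wrapped stream's bytes up to (not including) the next '<',
    // silently dropping ASCII whitespace so the basic Base64 decoder accepts the text.
    private static final class UntilTagStream extends FilterInputStream {
        private boolean done = false;

        UntilTagStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            if (done) return -1;
            int b;
            do {
                b = super.read();
            } while (b == ' ' || b == '\n' || b == '\r' || b == '\t');
            if (b == -1 || b == '<') {
                done = true;
                return -1;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            int n = 0;
            while (n < len) {
                int b = read();
                if (b == -1) break;
                buf[off + n++] = (byte) b;
            }
            return n == 0 ? -1 : n;
        }
    }

    // Skips past startTag, then decodes the Base64 text up to the next '<' into out.
    static void extract(InputStream in, byte[] startTag, OutputStream out) throws IOException {
        int matched = 0; // number of start-tag bytes matched so far
        int b;
        while (matched < startTag.length && (b = in.read()) != -1) {
            if (b == startTag[matched]) {
                matched++;
            } else {
                matched = (b == startTag[0]) ? 1 : 0;
            }
        }
        if (matched < startTag.length) {
            throw new IOException("start tag not found");
        }

        // Decode on the fly; only a small fixed-size buffer is allocated.
        InputStream decoded = Base64.getDecoder().wrap(new UntilTagStream(in));
        byte[] buf = new byte[8192];
        int n;
        while ((n = decoded.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }
}
```

Called as `Base64ElementExtractor.extract(stream, "<tag>".getBytes(StandardCharsets.US_ASCII), pdfOutputStream)`, this writes the decoded bytes straight to the output stream, so neither the full encoded text nor the full decoded document has to be held in an additional byte array.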