I am working with a ByteArrayInputStream that contains an XML document consisting of one element with a large Base64-encoded string as the content of the element. I need to remove the surrounding tags so I can decode the text and output it as a PDF document.
What is the most efficient way to do this?
My knee-jerk reaction is to read the stream into a byte array, find the end of the start tag, find the beginning of the end tag, and then copy the middle part into another byte array; but this seems rather inefficient, and the text I am working with can be large at times (128KB). I would like a way to do this without the extra byte arrays.
Do your search and conversion while you are reading the stream.
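// find the start tag
byte[] startTag = new byte[]{'<', 't', 'a', 'g', '>'};
int fnd = 0; // number of start-tag bytes matched so far
int tmp = 0;
while((tmp = stream.read()) != -1) {
    if(tmp == startTag[fnd])
        fnd++;
    else
        fnd = (tmp == startTag[0]) ? 1 : 0; // the mismatched byte may itself begin the tag
    if(fnd == startTag.length) break; // arrays expose .length, not size()
}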
// get base64 bytes
while(true) {
    int a = stream.read();
    if(a == -1 || a == '<') break; // end of stream or start of the end tag
    int b = stream.read();
    int c = stream.read();
    int d = stream.read();
    byte o1,o2,o3; // output bytes
    //
    ...
    outputStream.write(o1);
    outputStream.write(o2);
    outputStream.write(o3);
}
Note: the above was written in my web browser, so syntax errors may exist.
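For a more complete picture, here is a minimal sketch of the same idea that leaves the actual Base64 decoding to the JDK instead of hand-decoding each group of four characters. It assumes Java 8 or later (for `java.util.Base64`), a start tag with no attributes, and element content that contains no `<` before the end tag. The names `TagStripper`, `UntilTagStream`, and `decodeElement` are made up for illustration and are not from any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class TagStripper {

    /** Hypothetical helper: passes bytes through until the first '<', then reports end of stream. */
    static final class UntilTagStream extends FilterInputStream {
        private boolean done = false;

        UntilTagStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            if (done) return -1;
            int b = in.read();
            if (b == -1 || b == '<') {
                done = true;
                return -1;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) return 0;
            int n = 0;
            while (n < len) {
                int b = read();
                if (b == -1) break;
                buf[off + n++] = (byte) b;
            }
            return n == 0 ? -1 : n;
        }
    }

    /** Skips past startTag, then decodes the Base64 content up to the next '<' into out. */
    static void decodeElement(InputStream in, byte[] startTag, OutputStream out) throws IOException {
        int matched = 0;
        int b;
        // Simple scan; sufficient when startTag[0] ('<') occurs only at the start of the tag.
        while (matched < startTag.length && (b = in.read()) != -1) {
            if (b == startTag[matched]) {
                matched++;
            } else {
                matched = (b == startTag[0]) ? 1 : 0;
            }
        }
        if (matched < startTag.length) {
            throw new IOException("start tag not found");
        }

        // Assumption: the MIME decoder is used so line breaks inside the element are ignored.
        InputStream decoded = Base64.getMimeDecoder().wrap(new UntilTagStream(in));
        byte[] buf = new byte[8192];
        int n;
        while ((n = decoded.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<tag>SGVsbG8=</tag>".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        decodeElement(new ByteArrayInputStream(xml), "<tag>".getBytes(StandardCharsets.US_ASCII), out);
        System.out.println(new String(out.toByteArray(), StandardCharsets.US_ASCII)); // prints "Hello"
    }
}
```

In real use you would pass a FileOutputStream (or whatever sink receives the PDF) as `out`, so the only extra memory is the small copy buffer rather than a second full-size byte array.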