I'm using Axiom in Axis2 to extract the text from a large base64Binary section of a SOAP message. My receiver is not using MTOM, and uses OMElement.getTextAsStream( false ) to extract the text. The code looks something like this:
final Iterator<OMElement> childrenIterator = uploadFile.getChildElements();
while ( childrenIterator.hasNext() )
{
    final OMElement element = childrenIterator.next();
    if ( "fileID".equals( element.getLocalName() ) )
    {
        fileID = element.getText();
    }
    // fileContent contains a large base64Binary block
    else if ( "fileContent".equals( element.getLocalName() ) )
    {
        Reader reader = element.getTextAsStream( false );
        final char[] buf = new char[BUFFER_SIZE];
        int len = 0;
        while ( (len = reader.read( buf ) ) >= 0 )
        {
            if ( len > 0 )
            {
                // Process chunk here
            }
        }
    }
}
A sample XML would look like this:
<uploadFile>
    <fileID>id</fileID>
    <fileContent>~500kB of base64 data</fileContent>
</uploadFile>
I'm getting this exception on the childrenIterator.hasNext() line after the base64Binary data has been read:
Caused by: org.apache.axiom.om.OMException: Parser has already reached end of the document. No siblings found
    at org.apache.axiom.om.impl.llom.OMElementImpl.getNextOMSibling(OMElementImpl.java:359)
    at org.apache.axiom.om.impl.traverse.OMChildrenIterator.getNextNode(OMChildrenIterator.java:36)
    at org.apache.axiom.om.impl.traverse.OMAbstractIterator.hasNext(OMAbstractIterator.java:69)
    at org.apache.axiom.om.impl.traverse.OMFilterIterator.hasNext(OMFilterIterator.java:54)
I've done some investigating, and it's definitely related to the fact that I'm setting cache to false when calling getTextAsStream(). I need to do this because the base64 data could potentially be hundreds of megabytes in size.
The problem seems to be that TextFromElementReader advances the underlying XMLStreamReader to the END_ELEMENT event. OMElementImpl.getNextOMSibling() then calls next() on the underlying XMLStreamReader and gets the END_DOCUMENT event. It seems like the TextFromElementReader needs to encounter the END_ELEMENT to know that it has reached the end of the text segment, but this leaves the underlying XMLStreamReader in the wrong state for OMElementImpl.getNextOMSibling().
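The cursor behaviour described above can be illustrated with plain StAX (javax.xml.stream) instead of Axiom. This is only a sketch of the general mechanism, not of Axiom's internals: the class name CursorPositionSketch and the method eventsAroundText are made up for illustration, and getElementText() stands in for whatever TextFromElementReader does internally. It shows that once the text has been consumed the cursor already sits on the element's END_ELEMENT, so a further next() moves past that element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; plain StAX only, no Axiom involved.
final class CursorPositionSketch {

    /**
     * Reads the text of the first fileContent element and returns two event codes:
     * [0] the current event right after the text was consumed,
     * [1] the event returned by the next call to next().
     */
    static int[] eventsAroundText(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // Advance until the cursor is on <fileContent>.
            while (!(reader.isStartElement() && "fileContent".equals(reader.getLocalName()))) {
                reader.next();
            }
            reader.getElementText();              // consumes the text content
            int afterText = reader.getEventType(); // cursor is now on </fileContent>
            int following = reader.next();         // moves past </fileContent>
            reader.close();
            return new int[] { afterText, following };
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<uploadFile><fileID>id</fileID>"
                + "<fileContent>QUJD</fileContent></uploadFile>";
        int[] events = eventsAroundText(xml);
        System.out.println("on END_ELEMENT after text: "
                + (events[0] == XMLStreamConstants.END_ELEMENT));
        System.out.println("next() also END_ELEMENT (parent's end tag): "
                + (events[1] == XMLStreamConstants.END_ELEMENT));
    }
}
```

Any code that still believes the cursor is inside fileContent and calls next() to look for what follows will therefore skip one event too far, which is consistent with the state mismatch suspected above.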
Has anyone seen this error before? Is it something wrong with the way I'm using Axiom?
I ended up not using getTextAsStream() at all. Instead, I iterated through the child text nodes of the element and processed the text content in chunks that way. The parser is configured to be non-coalescing, so I get reasonably sized text nodes rather than one big one.
OMNode child = omElement.getFirstOMChild();
while ( child != null )
{
    if ( child instanceof OMText )
    {
        // process 'child' text here
        final OMNode nextSibling = child.getNextOMSibling();
        child.detach(); // detach from OM to keep memory usage low
        child = nextSibling;
    }
    else
    {
        // skip non-text nodes so the loop always advances
        child = child.getNextOMSibling();
    }
}
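One thing to watch when processing base64 text in chunks is that a chunk boundary will not necessarily fall on a multiple of four characters. The following is a minimal sketch of one way to handle that, using plain StAX with IS_COALESCING switched off rather than the Axiom API. The class name ChunkedBase64Sketch and the method decodeFileContent are hypothetical, and the ByteArrayOutputStream is only a stand-in for a real sink such as a file stream. Leftover characters from one text event are carried over to the next, so only complete four-character groups are decoded at each step.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; plain StAX only, no Axiom involved.
final class ChunkedBase64Sketch {

    /** Decodes the base64 text of the fileContent element, one text event at a time. */
    static byte[] decodeFileContent(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Non-coalescing: the parser may report long text as several CHARACTERS events.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

            ByteArrayOutputStream out = new ByteArrayOutputStream(); // stand-in for a real sink
            StringBuilder carry = new StringBuilder(); // characters not yet decoded
            Base64.Decoder decoder = Base64.getDecoder();
            boolean inFileContent = false;

            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    inFileContent = "fileContent".equals(reader.getLocalName());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    inFileContent = false;
                } else if (event == XMLStreamConstants.CHARACTERS && inFileContent) {
                    char[] chars = reader.getTextCharacters();
                    int start = reader.getTextStart();
                    int end = start + reader.getTextLength();
                    for (int i = start; i < end; i++) {
                        if (!Character.isWhitespace(chars[i])) {
                            carry.append(chars[i]); // drop line breaks and padding whitespace
                        }
                    }
                    // Decode only complete 4-character groups; keep the rest for the next event.
                    int usable = carry.length() - (carry.length() % 4);
                    if (usable > 0) {
                        byte[] decoded = decoder.decode(carry.substring(0, usable));
                        out.write(decoded, 0, decoded.length);
                        carry.delete(0, usable);
                    }
                }
            }
            reader.close();
            if (carry.length() > 0) {
                throw new IllegalStateException("base64 data ended in the middle of a group");
            }
            return out.toByteArray();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String payload = Base64.getEncoder()
                .encodeToString("hello, Axiom".getBytes(StandardCharsets.UTF_8));
        String xml = "<uploadFile><fileID>id</fileID><fileContent>"
                + payload + "</fileContent></uploadFile>";
        System.out.println(new String(decodeFileContent(xml), StandardCharsets.UTF_8));
    }
}
```

The same carry-over idea should apply inside the OMText loop above: append each node's text to the carry buffer, decode the complete groups, and keep the remainder for the next node. How many text events a given parser produces for a large block is implementation-specific, so the sketch does not rely on any particular chunk size.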