java xml memory axiom

Is it possible to stream the value of an attribute using Axiom?


I have an InputStream containing an XML payload with nested layers of XML. The payloads can be huge, and to use as little memory as possible I need to process everything as a stream. Unfortunately, the payload I am getting contains a deeply nested XML document with a tag that has an attribute whose value is another XML document.

<xml>
    <payload>&lt;xml&gt;&lt;another_payload value=&quot;&lt;xml&gt;&lt;xml/&gt;&quot;/&gt;&lt;xml/&gt;</payload>
</xml>

When I drill down into this payload, I find something that looks like this:

<another_payload value=&quot;&lt;xml&gt;&lt;xml/&gt;&quot;/>

Looking closely, you will notice that another_payload has an attribute called value, which holds another large XML document.

The attribute can contain a gigantic XML document that I cannot load into memory. I need to stream it, just as ElementHelper::getTextAsStream does for element text.

Before anyone asks: I have tried negotiating to have the payload structured differently at the origin so that I can handle it better, but for one reason or another they will not change it.


Solution

  • Axiom uses the StAX API for XML parsing and StAX doesn't support streaming of long attribute values.

    More generally, even if Axiom had its own XML parser, this would be tricky to support. Consider the following example:

    <some_element p:myattr="...long value..." xmlns:p="http://example.org"/>
    

    In this case, the parser can't resolve the namespace of the attribute before streaming the attribute value. Axiom would either have to support some form of lazy namespace resolution or the support for attribute value streaming would be limited to cases where the namespace can be resolved before the attribute value is processed (which would be the case for all unqualified attributes).
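
To make both points concrete, here is a minimal sketch that uses the StAX implementation bundled with the JDK rather than Axiom itself. The class name StaxAttributeDemo and the method describeFirstAttribute are illustrative and not part of any library. It shows that XMLStreamReader only exposes an attribute value as a complete String, and that the attribute's namespace is already resolved by the time the START_ELEMENT event is reported, even though the xmlns:p declaration appears after the attribute in the document.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxAttributeDemo {

    /**
     * Returns "{namespaceURI}localName=value" for the first attribute of the
     * first element in the given document. Hypothetical helper for illustration.
     */
    static String describeFirstAttribute(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        // The parser has consumed the whole start tag at this point,
                        // so the namespace declared after the attribute is resolved.
                        String namespace = reader.getAttributeNamespace(0);
                        String localName = reader.getAttributeLocalName(0);
                        // The only accessor for the value returns a String:
                        // the complete attribute value is held in memory.
                        String value = reader.getAttributeValue(0);
                        return "{" + namespace + "}" + localName + "=" + value;
                    }
                }
                throw new IllegalStateException("document contains no element");
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<some_element p:myattr=\"...long value...\" xmlns:p=\"http://example.org\"/>";
        System.out.println(describeFirstAttribute(xml));
    }
}
```

Running main prints {http://example.org}myattr=...long value... for the example above. The API has no Reader-style or chunked accessor for attribute values comparable to what it offers for character data, which is why a streaming solution cannot be layered on top of StAX.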