I'm using an XSLT 3 template with <xsl:output method="text"/>, which extracts some lines of text from an XML source document. The template is very particular, producing the individual lines and even the newlines (LF) in the right places.
Invoking the Saxon HE 12.2 JAR with Java 17 from the command line, I verify that the output text is precisely what I'm looking for, suitable for a .txt file.
The next step is to do the same thing programmatically, so I followed the documentation for using the s9api for transformations. Since I had used <xsl:output method="text"/>, I assumed that an XSLT processor would output only text. Instead it appears that transformer.applyTemplates(new StreamSource(xmlInputStream)) will produce an XdmValue, which is itself a sequence of XdmItems.
Investigating further, it seems that each XdmItem wraps an XdmNode of kind TEXT! (I see that this mirrors the DOM's text nodes.) There is a text node for each piece of output produced by the stylesheet, including a separate node for each newline in the output, e.g. from <xsl:text> </xsl:text> in the template.
As I mentioned, I had assumed that <xsl:output method="text"/> would make the transformer skip the XML world altogether and simply write the text to a text buffer. I imagined some sort of produceText(String) method, similar to Hadoop MapReduce emitting values, which would be collected immediately into a buffer without the need to wrap each of them in any sort of node. But I guess the XML foundation still presents itself to some extent, even in "text" output mode.
To me these nodes seem like needless overhead, as <xsl:output method="text"/> plainly indicates I don't need XML output at all. Maybe for historical reasons it's unavoidable. In any case, I understand that I can extract the text using this:
String text = xdmValue.stream().map(XdmItem::getStringValue).collect(joining());
My question is simply: is this the most efficient way to extract XSLT text output using Saxon, or is there a simpler, more direct way that skips the intermediate overhead of XdmNode items?
There is an overload of the applyTemplates method (https://www.saxonica.com/html/documentation12/javadoc/net/sf/saxon/s9api/Xslt30Transformer.html#applyTemplates(net.sf.saxon.s9api.XdmValue,net.sf.saxon.s9api.Destination)) that writes to a destination such as a Serializer (over a stream, file, or writer). I would suggest using it if you want Saxon to serialize the transformation result based on your xsl:output declarations.
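As a rough sketch of what that could look like, the following uses the applyTemplates(Source, Destination) overload with a Serializer wrapped around a StringWriter. The class name TextOutputSketch, the helper transformToText, and the tiny stylesheet and input document are all made up for illustration; only the s9api calls themselves come from Saxon. It assumes Saxon HE 12.x is on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.Serializer;
import net.sf.saxon.s9api.Xslt30Transformer;
import net.sf.saxon.s9api.XsltExecutable;

// Hypothetical example class; not part of Saxon or of the original question.
public class TextOutputSketch {

    // Runs the stylesheet against the source and returns the serialized result.
    // The Serializer is the Destination, so the caller never handles XdmValue items.
    static String transformToText(String stylesheet, String sourceXml) throws SaxonApiException {
        Processor processor = new Processor(false); // false = no licensed features (HE)
        XsltExecutable executable = processor.newXsltCompiler()
                .compile(new StreamSource(new StringReader(stylesheet)));
        Xslt30Transformer transformer = executable.load30();

        StringWriter buffer = new StringWriter();
        // No output properties are set here, so the stylesheet's xsl:output applies.
        Serializer serializer = processor.newSerializer(buffer);

        transformer.applyTemplates(new StreamSource(new StringReader(sourceXml)), serializer);
        return buffer.toString();
    }

    public static void main(String[] args) throws SaxonApiException {
        // Illustrative stylesheet: one line of text per <line> element.
        String stylesheet =
                "<xsl:stylesheet version=\"3.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
              + "<xsl:output method=\"text\"/>"
              + "<xsl:template match=\"/\">"
              + "<xsl:for-each select=\"//line\">"
              + "<xsl:value-of select=\".\"/><xsl:text>&#10;</xsl:text>"
              + "</xsl:for-each>"
              + "</xsl:template>"
              + "</xsl:stylesheet>";
        String sourceXml = "<doc><line>a</line><line>b</line></doc>";

        System.out.print(transformToText(stylesheet, sourceXml));
    }
}
```

To write straight to a .txt file instead, processor.newSerializer also accepts a java.io.File or an OutputStream, which avoids holding the whole result in memory.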