
Optimal way to chain/pipeline transforms


I've done some research on saxonica.com, Google, and ChatGPT to find the optimal way to chain/pipeline transforms without incurring serialization and parsing between stages. I found ways to do it with s9api, JAXP, and fn:transform(). I've also found articles that implement the intermediate results as DOM, as SAX, and as XDM. Some of those articles are 14 years old, so I'm not sure which approach is ideal now in 2023. I'm executing Saxon within a Java Servlet running on Jetty, based on the Servlet example on saxonica.com, which caches the compiled Templates. I don't like fn:transform(), as I don't think it can use compiled stylesheets the way I can in Java; if I got this wrong, let me know (I do have static parameters). Speed is my number one priority. It does not have to be portable, as I will only be using Saxon.

What is the modern recommendation?


Solution

  • You've left out the possibility of using a separate pipeline language such as XProc or Orbeon to orchestrate the workflow.

    You've also left out the possibility of doing a sequence of transformations within a single XSLT stylesheet using the pattern

    <xsl:variable name="phase-a-output">
      <xsl:apply-templates mode="phase-a"/>
    </xsl:variable>
    <xsl:variable name="phase-b-output">
      <xsl:apply-templates select="$phase-a-output" mode="phase-b"/>
    </xsl:variable>
    <xsl:apply-templates select="$phase-b-output" mode="phase-c"/>
    

    If you're comfortable coding in Java then I would recommend using the s9api API. Create three Xslt30Transformer instances A, B, and C, and then do

    A.applyTemplates(source,
       B.asDocumentDestination(
          C.asDocumentDestination(
             serializer)));
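
To make the chained call above concrete, here is a minimal sketch of the same idea as a complete program. It assumes a recent Saxon release (any edition) is on the classpath. The class name `ChainedTransforms`, the helper `wrapIn`, and the three toy stylesheets are hypothetical and exist only for illustration; a real application would compile its own stylesheets once and cache the resulting `XsltExecutable` objects.

```java
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.Serializer;
import net.sf.saxon.s9api.Xslt30Transformer;
import net.sf.saxon.s9api.XsltCompiler;
import net.sf.saxon.s9api.XsltExecutable;

import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

public class ChainedTransforms {

    // Hypothetical toy stylesheet: wraps the incoming document content in <name>.
    static String wrapIn(String name) {
        return "<xsl:stylesheet version='3.0' "
             + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:template match='/'>"
             + "<" + name + "><xsl:copy-of select='node()'/></" + name + ">"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    static XsltExecutable compile(XsltCompiler compiler, String xslt)
            throws SaxonApiException {
        return compiler.compile(new StreamSource(new StringReader(xslt)));
    }

    public static String run(String inputXml) throws SaxonApiException {
        Processor processor = new Processor(false);
        XsltCompiler compiler = processor.newXsltCompiler();

        // Compile each stage once. XsltExecutable is the thread-safe object
        // to cache (for example in a servlet); it is compiled here only to
        // keep the sketch self-contained.
        XsltExecutable exA = compile(compiler, wrapIn("a"));
        XsltExecutable exB = compile(compiler, wrapIn("b"));
        XsltExecutable exC = compile(compiler, wrapIn("c"));

        // Transformers are cheap to create but must not be shared between
        // threads, so load fresh ones per request.
        Xslt30Transformer a = exA.load30();
        Xslt30Transformer b = exB.load30();
        Xslt30Transformer c = exC.load30();

        StringWriter result = new StringWriter();
        Serializer serializer = processor.newSerializer(result);
        serializer.setOutputProperty(Serializer.Property.OMIT_XML_DECLARATION, "yes");

        // A writes into B, B writes into C, and only C's output is serialized.
        a.applyTemplates(new StreamSource(new StringReader(inputXml)),
                b.asDocumentDestination(
                        c.asDocumentDestination(serializer)));
        return result.toString();
    }

    public static void main(String[] args) throws SaxonApiException {
        System.out.println(run("<doc/>"));
    }
}
```

Only the final stage is serialized here; the intermediate results are handed from one transformer to the next as destinations rather than as text.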
    

    You mention that you are using static parameters. You should be aware that static parameters are applied at the time the stylesheet is compiled, which means that if you use the same stylesheet repeatedly with different values for static parameters, it's going to be recompiled each time.
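
One way to limit that recompilation cost, sketched below under assumptions, is to cache the compiled stylesheet per distinct combination of static parameter values. The class `StaticParamCache` and its compile callback are hypothetical and deliberately independent of Saxon so the sketch runs on its own; in a real setup the callback would presumably set the values with `XsltCompiler.setParameter` before calling `compile`, and the cached value would be an `XsltExecutable`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache: one compiled artifact per distinct set of static parameters.
public class StaticParamCache<C> {

    private final Map<Map<String, String>, C> cache = new ConcurrentHashMap<>();
    private final Function<Map<String, String>, C> compileFn;
    private final AtomicInteger compilations = new AtomicInteger();

    // compileFn stands in for the real compile step (an assumption of this sketch).
    public StaticParamCache(Function<Map<String, String>, C> compileFn) {
        this.compileFn = compileFn;
    }

    public C get(Map<String, String> staticParams) {
        // Map.copyOf yields an immutable key with value-based equals/hashCode,
        // so equal parameter sets map to the same cache entry.
        return cache.computeIfAbsent(Map.copyOf(staticParams), key -> {
            compilations.incrementAndGet();
            return compileFn.apply(key);
        });
    }

    // Number of times the compile step actually ran.
    public int compilations() {
        return compilations.get();
    }
}
```

This only helps when the static parameters take a small number of distinct values; if every request uses different values, each one still triggers a compilation, and an ordinary (non-static) stylesheet parameter may be the better fit.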

    I think you're right to be a little wary of fn:transform(), not so much for performance reasons as because it can be very hard to debug.