
How can I push an element into XML using path() in XSLT 3?


I have large XML files and need to inject processing instructions into them. The locations for the processing instructions are listed as xpath locations. I created a small version of the XML file to give a complete example here.

This is a small sample XML file:

<?xml version="1.0" encoding="UTF-8"?>
<ACT>
<TITLE>
    <P>Some title</P>
</TITLE>
<LIST>
    <ITEM>
        <LABEL>Article 1</LABEL>
        <P>Dummy text for article 1.</P>
    </ITEM>
    <ITEM>
        <LABEL>Article 2</LABEL>
        <P>Dummy text for article 2.</P>
        <LIST>
            <ITEM>
                <LABEL>Article a</LABEL>
                <P>Dummy text for article a.</P>
            </ITEM>
            <ITEM>
                <LABEL>Article b</LABEL>
                <P>Dummy text for article b.</P>
            </ITEM>
            <ITEM>
                <LABEL>Article c</LABEL>
                <P>Dummy text for article c.</P>
            </ITEM>
        </LIST>
    </ITEM>
    <ITEM>
        <LABEL>Article 3</LABEL>
        <P>Dummy text for article 3.</P>
    </ITEM>
</LIST>
</ACT>

The file with xpath locations looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<report>
    <page_break>
        <description>NO 1</description>
        <xpath_location>/ACT[1]/LIST[1]/ITEM[1]/P[1]</xpath_location>
    </page_break>
    <page_break>
        <description>NO 2</description>
        <xpath_location>/ACT[1]/LIST[1]/ITEM[2]/LIST[1]/ITEM[3]/P[1]</xpath_location>
    </page_break>
</report>

I tried to use the XSLT 3 path() function to match the position of the current element against the xpath_location elements in my PageBreaks.xml file, but I am not getting any matches. When I output the path() of the current element in a message, every step carries a Q{} prefix (for example /Q{}ACT[1]/Q{}LIST[1]). Adding those prefixes to the target locations in the file did not produce any matches either.

Here is the XSL I have tried:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">

<xsl:output method="xml" encoding="UTF-8"/>

<xsl:variable name="breaks" as="node()" select="document('Breaks.xml')/report"/>

<xsl:template match="/">
    <xsl:apply-templates/>
</xsl:template>

<xsl:template match="@*|text()">
    <xsl:copy>
        <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="*">
    <xsl:variable name="mypath">
        <xsl:value-of select="./path()"/>
    </xsl:variable>
    <xsl:variable name="pi_insert">
        <xsl:if test="$breaks//xpath_location[. eq $mypath]">yes</xsl:if>
    </xsl:variable>
    <xsl:copy>
        <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
    <xsl:if test="$pi_insert eq 'yes'">
        <xsl:processing-instruction name="PAGE">
            <xsl:value-of select="$breaks//xpath_location[. eq $mypath]/preceding-sibling::description"/>
        </xsl:processing-instruction>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>

Solution

  • Doing a string match on the result of path() is not very robust: the slightest variation in how the path is written in the XML document (some extra whitespace, for example) and it won't match. I think it would be more reliable to use xsl:evaluate, which evaluates each stored path as an XPath expression against the source document and so selects the target node directly.

    Start by building a map from selected nodes to their descriptions:

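    <xsl:variable name="map" as="map(xs:string, xs:string)">
      <!-- In a global variable, "." is the source document being transformed -->
      <xsl:variable name="root" select="."/>
      <xsl:map>
        <xsl:for-each select="$breaks//page_break">
          <!-- Evaluate the stored path against the source document -->
          <xsl:variable name="selectedNode" as="element(*)">
            <xsl:evaluate xpath="xpath_location"
                          context-item="$root"/>
          </xsl:variable>
          <!-- key is an XPath expression, not an attribute value template -->
          <xsl:map-entry key="generate-id($selectedNode)" select="string(description)"/>
        </xsl:for-each>
      </xsl:map>
    </xsl:variable>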
    

    and then use this map to expand the relevant nodes:

    <xsl:template match="*[exists($map(generate-id(.)))]">
        <!-- Let the lower-priority rule copy the element itself first -->
        <xsl:next-match/>
        <xsl:processing-instruction name="PAGE">
            <xsl:value-of select="$map(generate-id(.))"/>
        </xsl:processing-instruction>
    </xsl:template>
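
    Putting the pieces together, a complete stylesheet might look like the sketch below. It is a minimal version under several assumptions rather than a tested drop-in: the file name Breaks.xml and the page_break, xpath_location and description element names are taken from the question; xsl:mode with on-no-match="shallow-copy" stands in for the question's hand-written copy templates; and xsl:next-match is used so that the matched element is still copied, with the processing instruction emitted after it as in the question's stylesheet. It also assumes that each xpath_location selects exactly one element and that no two page breaks point at the same element (duplicate keys in xsl:map raise an error). Depending on the processor and its configuration, xsl:evaluate may be unavailable or disabled.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs"
        version="3.0">

      <xsl:output method="xml" encoding="UTF-8"/>

      <!-- Assumption: identity transform for every node no other rule matches -->
      <xsl:mode on-no-match="shallow-copy"/>

      <!-- File name as used in the question's stylesheet -->
      <xsl:variable name="breaks" as="element(report)"
                    select="document('Breaks.xml')/report"/>

      <!-- The document being transformed; stored paths are evaluated against it -->
      <xsl:variable name="root" as="document-node()" select="/"/>

      <!-- Map from the generated id of each target element to its description -->
      <xsl:variable name="map" as="map(xs:string, xs:string)">
        <xsl:map>
          <xsl:for-each select="$breaks/page_break">
            <xsl:variable name="selectedNode" as="element()">
              <xsl:evaluate xpath="xpath_location" context-item="$root"/>
            </xsl:variable>
            <xsl:map-entry key="generate-id($selectedNode)"
                           select="string(description)"/>
          </xsl:for-each>
        </xsl:map>
      </xsl:variable>

      <!-- Copy the targeted element via the built-in rule, then add the PI after it -->
      <xsl:template match="*[exists($map(generate-id(.)))]">
        <xsl:next-match/>
        <xsl:processing-instruction name="PAGE" select="$map(generate-id(.))"/>
      </xsl:template>

    </xsl:stylesheet>
    ```

    With the sample files from the question, this should place a PAGE processing instruction carrying the text NO 1 directly after the P element of Article 1, and one carrying NO 2 after the P element of Article c.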