
XSLT: replacing a node with the equivalent node of another document


I want to replace some nodes of an XML file with the equivalent nodes of another XML file. As if that weren't challenging enough, the ID used for the comparison has to be the value of a child element.

The "old" XML looks like:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Documents>
        <Document id="001">
            <Tags>
                <Tag id="document_id">someIDfilename.pdf</Tag>
                <Tag id="document_type">Type A</Tag>
                <Tag id="document_text">A very important document of course.</Tag>
            </Tags>
        </Document>
        <Document id="018">
            <Tags>
                <Tag id="document_id">someOtherIDfilename.pdf</Tag>
                <Tag id="document_type">Type B</Tag>
                <Tag id="document_text">Another very important document.</Tag>
            </Tags>
        </Document>
    </Documents>
</Root>

The second Document shall be replaced by its equivalent from the following XML. The ID I have to use for matching is the value of document_id, since the "id" attribute of the Document node is sometimes overwritten or altered:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Documents>
        <Document id="014">
            <Tags>
                <Tag id="document_id">someOtherIDfilename.pdf</Tag>
                <Tag id="document_type">Type B</Tag>
                <Tag id="document_text">The oh so important new document text.</Tag>
            </Tags>
        </Document>
    </Documents>
</Root>

The result is expected to look like:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <Documents>
        <Document id="001">
            <Tags>
                <Tag id="document_id">someIDfilename.pdf</Tag>
                <Tag id="document_type">Type A</Tag>
                <Tag id="document_text">A very important document of course.</Tag>
            </Tags>
        </Document>
        <Document id="018">
            <Tags>
                <Tag id="document_id">someOtherIDfilename.pdf</Tag>
                <Tag id="document_type">Type B</Tag>
                <Tag id="document_text">The oh so important new document text.</Tag>
            </Tags>
        </Document>
    </Documents>
</Root>

Q1: Is that possible by means of XSLT, or do I have to use Java / DOM?

Q2: If the answer to Q1 is yes, can somebody show a solution here?

Best! Philipp


Solution

  • Using an XSLT 2.0 processor like Saxon 9:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
    
    <xsl:param name="doc2-url" select="'test2014032603.xml'"/>
    <xsl:variable name="doc2" select="doc($doc2-url)"/>
    
    <xsl:key name="id" match="Document" use="Tags/Tag[@id = 'document_id']"/>
    
    <xsl:template match="@* | node()">
      <xsl:copy>
        <xsl:apply-templates select="@* , node()"/>
      </xsl:copy>
    </xsl:template>
    
    <xsl:template match="Document[key('id', Tags/Tag[@id = 'document_id'], $doc2)]">
      <xsl:copy>
        <xsl:copy-of select="@id, key('id', Tags/Tag[@id = 'document_id'], $doc2)/node()"/>
      </xsl:copy>
    </xsl:template>
    
    </xsl:stylesheet>
    

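    It is also possible with XSLT 1.0, but there key() only searches the document that contains the context node. The code therefore has to switch the context to the second document (here with xsl:for-each) before calling key(), which makes it more convoluted: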

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">
    
    <xsl:param name="doc2-url" select="'test2014032603.xml'"/>
    <xsl:variable name="doc2" select="document($doc2-url)"/>
    
    <xsl:key name="id" match="Document" use="Tags/Tag[@id = 'document_id']"/>
    
    <xsl:template match="@* | node()" name="identity">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:template>
    
    <xsl:template match="Document[Tags/Tag[@id = 'document_id']]">
      <xsl:variable name="this" select="."/>
      <xsl:for-each select="$doc2">
        <xsl:choose>
          <xsl:when test="key('id', $this/Tags/Tag[@id = 'document_id'])">
            <xsl:for-each select="key('id', $this/Tags/Tag[@id = 'document_id'])">
              <xsl:copy>
                <xsl:copy-of select="$this/@id"/>
                <xsl:copy-of select="node()"/>            
              </xsl:copy>
            </xsl:for-each>
          </xsl:when>
          <xsl:otherwise>
            <xsl:copy-of select="$this"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </xsl:template>
    
    </xsl:stylesheet>
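
    If the context switch needed for key() in XSLT 1.0 feels too convoluted, the lookup can also be written as a plain path expression on $doc2. The following is only a sketch under some assumptions: the second document has the same Root/Documents/Document layout as shown in the question, it is small enough that a linear scan per Document is acceptable (there is no key index here), and only the first match is used if several Documents share a document_id. The variable names doc-id and replacement are made up for this illustration.

    ```xml
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">

    <!-- Same parameter and variable as in the stylesheets above -->
    <xsl:param name="doc2-url" select="'test2014032603.xml'"/>
    <xsl:variable name="doc2" select="document($doc2-url)"/>

    <!-- Identity template: copies everything not handled below -->
    <xsl:template match="@* | node()">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:template>

    <xsl:template match="Document[Tags/Tag[@id = 'document_id']]">
      <!-- Value used for matching, taken from the current (old) Document -->
      <xsl:variable name="doc-id" select="string(Tags/Tag[@id = 'document_id'])"/>
      <!-- Linear scan of the second document; the path is an assumption
           based on the sample XML in the question -->
      <xsl:variable name="replacement"
        select="$doc2/Root/Documents/Document[Tags/Tag[@id = 'document_id'] = $doc-id][1]"/>
      <xsl:choose>
        <xsl:when test="$replacement">
          <!-- Keep the old element and its id attribute, take the new children -->
          <xsl:copy>
            <xsl:copy-of select="@id"/>
            <xsl:copy-of select="$replacement/node()"/>
          </xsl:copy>
        </xsl:when>
        <xsl:otherwise>
          <!-- No counterpart in the second document: keep the old Document -->
          <xsl:copy-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

    </xsl:stylesheet>
    ```

    For large second documents the key-based versions above are likely the better choice, since a key lets the processor build an index instead of rescanning $doc2 for every Document.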