XSLT merging two files with different namespaces


This is my master HTML file, which declares a default namespace:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>some title</title>
    </head>
    <body>
        <p>some text</p>
    </body>
</html>

And I have an additional XML file defined like this:

<?xml version="1.0" encoding="UTF-8"?>
<article dtd-version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
    <front>
        <element>front text</element>
    </front>
    <back>
        <extra-list>
            <element>element text</element>
        </extra-list>
    </back>
</article>

This is the desired final output (head from the HTML file, extra-list from the XML file):

<?xml version="1.0" encoding="UTF-8"?>
<xml>
  <head>
    <title>some title</title>
  </head>
  <back>
    <extra-list>
      <element>element text</element>
    </extra-list>
  </back>
</xml>

I am trying to join these two files with the XSLT below:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xpath-default-namespace="http://www.w3.org/1999/xhtml"
  version="2.0">
  
  <xsl:output method="xml" version="1.0" indent="yes"/>
  
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="html">
    <xml>
      <xsl:apply-templates/>
    </xml>
  </xsl:template>
  
  <xsl:template match="head">
    <head>
      <xsl:apply-templates/>
    </head>
  </xsl:template>
  
  <xsl:template match="body">
    <back>
      <xsl:copy-of select="document('doc.xml')"/>       
    </back>
  </xsl:template>
  
</xsl:transform>

I use xpath-default-namespace in the XSLT so that I don't have to qualify the HTML namespace everywhere (the original master HTML is huge), and I would like to keep this attribute if possible. I have two issues:

1.) How can I get rid of all xmlns declarations in the output?

2.) With <xsl:copy-of select="document('doc.xml')"/> I can only copy the whole XML file. If I try to copy just a subelement with <xsl:copy-of select="document('doc.xml')/article/back"/>, I get no output, because the elements in doc.xml are not in the XHTML namespace that xpath-default-namespace applies to the path. How can I solve this?

UPDATE (COMPLETE XSLT SOLUTION):

Based on Martin's answer below, this is the fully working solution.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xpath-default-namespace="http://www.w3.org/1999/xhtml"
  version="2.0">
  
  <xsl:output method="xml" version="1.0" indent="yes"/>
  
  <!-- copy all elements and ignore namespace -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>
  
  <!-- copy all attributes and ignore namespace -->
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
  
  <!-- copy all remaining nodes and ignore namespace -->
  <xsl:template match="comment() | text() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>
  
  <xsl:template match="html">
    <xml>
      <xsl:apply-templates/>
    </xml>
  </xsl:template>
  
  <xsl:template match="head">
    <head>
      <xsl:apply-templates/>
    </head>
  </xsl:template>
  
  <xsl:template match="body">
    <xsl:copy-of xpath-default-namespace="" copy-namespaces="no" select="document('doc.xml')/article/back"/>    
  </xsl:template>
  
</xsl:transform>

I also added two extra templates: one that copies attributes and one that copies the remaining node types (comments, text, and processing instructions).


Solution

  • You can override xpath-default-namespace where needed, e.g. <xsl:copy-of xpath-default-namespace="" select="document('doc.xml')/article/back"/>.

    As for namespaces, there are several issues. You run part of the input, which is in the XHTML namespace, through an identity transformation, and that always preserves the namespace of the copied elements. You need to replace the identity transformation with one that strips the namespace from elements:

      <xsl:template match="*">
          <xsl:element name="{local-name()}">
              <xsl:apply-templates select="@* | node()"/>
          </xsl:element>
      </xsl:template>
    

    The literal result elements you create in the XSLT have the XLink namespace in scope, because you declare it in the stylesheet but never use it. Either remove the declaration or add exclude-result-prefixes="xlink" to the xsl:stylesheet or xsl:transform element.
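
    A minimal sketch of the exclude-result-prefixes option, assuming you want to keep the xlink declaration in the stylesheet; the match="/" template is only a placeholder to show a literal result element and is not part of your code:

      <xsl:transform
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xpath-default-namespace="http://www.w3.org/1999/xhtml"
        exclude-result-prefixes="xlink"
        version="2.0">

          <!-- placeholder template: the literal result element below is
               written without an xmlns:xlink declaration -->
          <xsl:template match="/">
              <xml/>
          </xsl:template>

      </xsl:transform>

    Note that exclude-result-prefixes only affects literal result elements; it does not remove namespaces from nodes copied from the input.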

    The other input, which you access with document('doc.xml'), also declares unused namespaces. The default copying preserves them, but since they are only in scope and not actually used, you can get rid of them with copy-namespaces="no": <xsl:copy-of xpath-default-namespace="" select="document('doc.xml')/article/back" copy-namespaces="no"/>. Alternatively, push those elements through the namespace-stripping template (xsl:element name="{local-name()}") as well.
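
    A sketch of that alternative, which pushes the elements from doc.xml through the namespace-stripping template instead of using xsl:copy-of. It assumes doc.xml sits next to the stylesheet, as in your code:

      <xsl:transform
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xpath-default-namespace="http://www.w3.org/1999/xhtml"
        version="2.0">

          <xsl:output method="xml" version="1.0" indent="yes"/>

          <!-- recreate every element by local name only, dropping its namespace -->
          <xsl:template match="*">
              <xsl:element name="{local-name()}">
                  <xsl:apply-templates select="@* | node()"/>
              </xsl:element>
          </xsl:template>

          <xsl:template match="html">
              <xml>
                  <xsl:apply-templates/>
              </xml>
          </xsl:template>

          <!-- xpath-default-namespace="" makes article/back select
               no-namespace elements in doc.xml -->
          <xsl:template match="body">
              <xsl:apply-templates xpath-default-namespace=""
                                   select="document('doc.xml')/article/back"/>
          </xsl:template>

      </xsl:transform>

    Because xsl:element creates new elements rather than copying existing ones, the xlink and mml declarations from doc.xml should not be carried over, so copy-namespaces is not needed here. The back subtree in your sample has no attributes; if the real data does, you would also need a template for @* that recreates them.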