
Unwanted new attributes in XML-output after XSLT


I want to transform an XML file using XSLT. During the transformation, new attributes are added to the output file, and I can't work out where they come from.

Input XML file (abbr.):

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <title/>
            </titleStmt>
            <publicationStmt>
                <publisher/>
            </publicationStmt>
            <sourceDesc>
                <p/>
            </sourceDesc>
        </fileDesc>
    </teiHeader>
    <text>
        <body>
            <pb n="1"/>
            <p xml:id="uuid_f770d0a9-277a-42d5-a759-a02b05a29a49">
                <lb xml:id="uuid_cf21b2a4-e5b2-4ced-a1b2-a4e5b29ced0a"/>This is 
                <lb xml:id="uuid_dffd4def-3f7a-4d5b-bd4d-ef3f7abd5bf9"/> <rs type="person">an example</rs>.
            </p>
        </body>
    </text>
</TEI>

XSLT stylesheet to remove the unwanted whitespace between <lb/> and <rs> (the second <lb/> line inside <p> in the input above), run with Saxon PE 9.9.1.7:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:tei="http://www.tei-c.org/ns/1.0"
  exclude-result-prefixes="xs"
  version="2.0">
  
  <xsl:output method="xml" version="1.0" indent="no"/>
    
    <xsl:template match="@* | node()" name="identity-copy">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy> 
    </xsl:template>
    
    <!-- lb followed by rs -->
    <xsl:template match="text()[preceding-sibling::*[1][self::*:lb[@*]]][following-sibling::*[1][self::*:rs[@*]]][string-length(normalize-space()) = 0]"/>
  
</xsl:stylesheet>

In the output XML file, new attributes are added to elements that my stylesheet doesn't even address. For example, @status='draft' is added to <revisionDesc>/<change> (not shown in the abbreviated input) and @part='N' to <p>. I could list more examples, but I think it's a general problem. How can I avoid this?

Thanks in advance!


Solution

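  • These attributes are default values defined in the schema or DTD, which are expanded when the input is parsed; the stylesheet itself does not create them. If you're using Saxon, there is an option to suppress this expansion of schema- or DTD-defined default attribute values. (It doesn't work with every XML parser, because some don't offer such an option.) In Oxygen's "Configure transformation scenario" dialog, it is shown as the checkbox 'Expand attribute defaults ("-expand")'; untick it to turn the expansion off.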
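
Outside Oxygen, the same setting can be passed on the Saxon command line as -expand:off. The call below is only a sketch: the jar name saxon9pe.jar and the file names input.xml, remove-whitespace.xsl and output.xml are assumptions for illustration, not taken from the question, so adjust them to your installation.

```shell
# Assumed names: saxon9pe.jar (Saxon PE 9.x jar), input.xml, remove-whitespace.xsl, output.xml.
# -expand:off asks Saxon not to expand attribute defaults defined in a DTD or schema.
java -jar saxon9pe.jar -s:input.xml -xsl:remove-whitespace.xsl -o:output.xml -expand:off
```

As noted above, whether the switch has any effect depends on the XML parser in use, so it is worth checking the output for @part or @status after the first run.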