XSLT adding extra <br> tag when "copy-of" is used


I am transforming this XML:

<?xml version='1.0' encoding='utf-8'?>
<song xmlns="http://openlyrics.info/namespace/2009/song" version="0.8" createdIn="OpenLP 2.0.1" modifiedIn="OpenLP 2.0.1" modifiedDate="2012-03-14T02:21:52">
  <properties>
    <titles>
      <title>Amazing Grace</title>
    </titles>
    <authors>
      <author>John Newton</author>
    </authors>
  </properties>
  <lyrics>
    <verse name="v1">
      <lines>Amazing grace, how sweet the sound<br/>That saved a wretch like me<br/>I once was lost, but now am found<br/>Was blind but now I see</lines>
    </verse>
  </lyrics>
</song>

with this XSLT:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:o="http://openlyrics.info/namespace/2009/song">
<xsl:output method="html" />
<xsl:template match="o:song">
<html>
<body>
<h2><xsl:value-of select="o:properties/o:titles/o:title"/><br /></h2>
<h3><xsl:for-each select="o:properties/o:authors/o:author">
  <xsl:value-of select="."/><br />
  </xsl:for-each></h3>
  <xsl:for-each select="o:lyrics/o:verse/o:lines">
  <xsl:copy-of select="." />
  </xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

Each part of the "lines" has a br/ tag I want to keep, which is why I'm using copy-of instead of value-of. This is what I'm trying to get from "lines":

<lines xmlns="http://openlyrics.info/namespace/2009/song">Amazing grace, how sweet the sound<br/>That saved a wretch like me

Instead, the XSLT changes the br/ tag into

<br></br>

and yields two line breaks instead of just the one I want.

<lines xmlns="http://openlyrics.info/namespace/2009/song">Amazing grace, how sweet the sound<br></br>

Is there a way to prevent this br/ transformation and preserve the br/ tags present in the XML while still using copy-of? What am I doing wrong here?

Note: Strangely, when I delete the namespace in the "song" tag and write the XSLT as if there were no namespaces, such as

  <xsl:for-each select="lyrics/verse/lines">
  <xsl:copy-of select="." />

then I don't have this problem. The br/ tags are preserved. So I feel like it has something to do with using the namespace.

Any ideas?


Solution

  • The problem is not really with XSLT itself, but with how the result document from the XSLT is serialised to a string.

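    You have specified the output method "html" in the XSLT, which is used by the serialiser, but not everything you output is actually HTML. lines is not an HTML element, and furthermore, because the default namespace declared on song is inherited by every descendant, all the child br elements are in that namespace too. The html output method only applies its HTML rules (such as writing br with no end tag) to elements that are in no namespace; an element with a namespace is serialised as XML, and the serialiser is then free to write an empty element as <br></br>, which a browser treats as two line breaks.

    When you remove the namespace, the br elements are in no namespace, so they are recognised as HTML elements, and you get your expected behavior, because the serialisation process knows what to do with them.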
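
    To make the namespace inheritance concrete, here is an abbreviated version of the source document rewritten with an explicit prefix. The prefix o is only an illustrative choice (it happens to match the one bound in the stylesheet), and the attributes on song are left out for brevity. To an XSLT processor this is the same tree as the original, which is why the br elements only match as o:br:

    ```xml
    <!-- Same element names and namespaces as the source, with the inherited
         default namespace written out as an explicit prefix (illustrative). -->
    <o:song xmlns:o="http://openlyrics.info/namespace/2009/song">
      <o:lyrics>
        <o:verse name="v1">
          <!-- br is in the OpenLyrics namespace, so it is not an HTML br -->
          <o:lines>Amazing grace, how sweet the sound<o:br/>That saved a wretch like me</o:lines>
        </o:verse>
      </o:lyrics>
    </o:song>
    ```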

    One way around this is to ensure the br elements are output without a namespace. Instead of using xsl:for-each for the lines, use xsl:apply-templates:

    <xsl:apply-templates select="o:lyrics/o:verse/o:lines" />
    

    Then you would have templates matching lines and br. In particular, the template matching br would output the element without a namespace.

    Try this XSLT:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:o="http://openlyrics.info/namespace/2009/song" exclude-result-prefixes="o">
       <xsl:output method="html"/>
    
       <xsl:template match="o:song">
          <html>
             <body>
                <h2>
                   <xsl:value-of select="o:properties/o:titles/o:title"/>
                   <br/>
                </h2>
                <h3>
                   <xsl:for-each select="o:properties/o:authors/o:author">
                      <xsl:value-of select="."/>
                      <br/>
                   </xsl:for-each>
                </h3>
                <xsl:apply-templates select="o:lyrics/o:verse/o:lines"/>
             </body>
          </html>
       </xsl:template>
    
       <xsl:template match="o:lines">
          <xsl:copy>
             <xsl:apply-templates/>
          </xsl:copy>
       </xsl:template>
    
       <xsl:template match="o:br">
          <br/>
       </xsl:template>
    </xsl:stylesheet>
    

    This should output the following

    <html>
    <body>
    <h2>Amazing Grace<br></h2>
    <h3>John Newton<br></h3>
    <lines xmlns="http://openlyrics.info/namespace/2009/song">
       Amazing grace, how sweet the sound<br xmlns="">
       That saved a wretch like me<br xmlns="">
       I once was lost, but now am found<br xmlns="">
       Was blind but now I see
    </lines>
    </body>
    </html>
    

    This should at least give you single line-breaks when viewed in a browser.
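
    If you would rather not have the lines element keep its namespace either (which is what forces the xmlns="" on each br), a possible variation is to rebuild every OpenLyrics element under its local name in no namespace. This is only a sketch under that assumption, not something the original data format requires; the author loop from the stylesheet above is omitted to keep it short:

    ```xml
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:o="http://openlyrics.info/namespace/2009/song"
        exclude-result-prefixes="o">
       <xsl:output method="html"/>

       <xsl:template match="o:song">
          <html>
             <body>
                <h2><xsl:value-of select="o:properties/o:titles/o:title"/></h2>
                <xsl:apply-templates select="o:lyrics/o:verse/o:lines"/>
             </body>
          </html>
       </xsl:template>

       <!-- Assumption: any other OpenLyrics element (lines, br, ...) should be
            re-created with the same local name but in no namespace. -->
       <xsl:template match="o:*">
          <xsl:element name="{local-name()}">
             <xsl:copy-of select="@*"/>
             <xsl:apply-templates/>
          </xsl:element>
       </xsl:template>
    </xsl:stylesheet>
    ```

    The more specific o:song template takes precedence over the o:* one for the root element. With both lines and br in no namespace, the html output method should treat br as an ordinary HTML empty element, and no xmlns declarations should be needed in the result.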