Tags: xslt, xslt-2.0, saxon

I want to turn the CDATA content of an element into an attribute value and keep the line breaks


I want to transform an XML file into another XML file using XSLT, and replace the CDATA content of some XML elements with an attribute.

I also need to keep the newline characters, so I need to convert those found in the CDATA content to "&#xA;".

For example, if I have:

<description>
  This is a description on
  more than one line
</description>

I want to have the following result:

<description value="This is a description on&#xA;   more than one line"/>

I wrote the following XSL file to do this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:variable name="new-line" select="'&#10;'" /> 

  <xsl:template name="writeParam" match="/">
    <xsl:choose>
       <xsl:when test="description">      
          <xsl:element name="description">
             <xsl:variable name="temp">
                <xsl:value-of select="description"/>
             </xsl:variable>    
             <xsl:attribute name="value">            
                <xsl:value-of select="translate($temp, $new-line, '&#xA;')" />
             </xsl:attribute>               
          </xsl:element>
       </xsl:when>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

But instead I have the following result:

<description value="&#10;   This is a description on&#10;   more than one line&#10;"/>

It seems that the translate function is never called. Why is it not working?


Solution

  • The requirements you state are conflicting:

    You say you want to keep the newline characters, but the expected result you show does not have the original leading and trailing whitespace characters, including the line breaks.
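
    On the narrower question of why translate() seems never to be called: it is called, but it cannot change anything. &#10; (decimal) and &#xA; (hexadecimal) are two spellings of the same character, U+000A, and the XML parser resolves both before the XSLT processor sees the stylesheet, so the call replaces each line feed with a line feed. The serializer then writes the line feed in the attribute as a character reference, because a literal line break in an attribute value would be normalized to a space when the output is parsed again. A minimal sketch to check the first point (this stylesheet is illustrative only, not part of the original code):

    ```xml
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>

      <!-- Both character references below are resolved to U+000A by the
           XML parser, so XPath compares two identical one-character strings. -->
      <xsl:template match="/">
        <xsl:value-of select="'&#10;' = '&#xA;'"/>
      </xsl:template>
    </xsl:stylesheet>
    ```

    Run against any input document, this prints true, which is why the output of the original stylesheet is the same with or without the translate() call.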

    To get the result you show, you need something like this (in XSLT 2.0):

    <xsl:template match="description"> 
        <xsl:copy>
            <xsl:attribute name="value" select="replace(., '^\s+|\s+$', '')"/>
        </xsl:copy>
    </xsl:template> 
    

    Note that this keeps not only the line breaks between the lines, but also the indenting spaces. If you want to get rid of those too, there's more work to be done, maybe something like:

            <xsl:attribute name="value" select="replace(replace(., '^\s+|\s+$', ''), '&#10; +', '&#10;')"/>
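
    Putting the pieces together, a complete stylesheet could look like the sketch below. It assumes an XSLT 2.0 processor (Saxon, going by the tags), since replace() and the select attribute on xsl:attribute are not available in XSLT 1.0. The identity template is an addition for illustration, so that everything other than description is copied unchanged; drop it if that is not what you want.

    ```xml
    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

      <!-- Identity template (added for this sketch): copies every node
           and attribute that no more specific template matches. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <!-- Moves the text of description into a value attribute.
           Inner replace(): trims leading and trailing whitespace.
           Outer replace(): removes the spaces that follow each line feed. -->
      <xsl:template match="description">
        <xsl:copy>
          <xsl:attribute name="value"
              select="replace(replace(., '^\s+|\s+$', ''), '&#10; +', '&#10;')"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>
    ```

    With the sample input from the question, the attribute value is the two lines of text separated by a single line feed, which the serializer writes as a character reference. Note that xsl:copy on description copies only the element itself, so any attributes already present on description are dropped, as in the template above.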