
XslCompiledTransform strips new line characters inside CDATA without replacing them


I have source XML with this data:

<add>
<doc>
<field name="Body"><![CDATA[Line break 1\r\n\r\nline break 2\r\n\r\nline break 3\r\n\r\n Some more text.]]>
</field>
</doc>
</add>

I am using XslCompiledTransform to transform it with this XSLT:

<xsl:template match="add">
    <add>
        <xsl:for-each select="doc">
            <doc>
                <xsl:apply-templates select="@* | node()" />
            </doc>
        </xsl:for-each>
    </add>
</xsl:template>

So the Body field should just pass through unchanged. The C# code to perform the transformation is as follows:

XmlDocument source = new XmlDocument();
StringReader reader = new StringReader("My source xml comes in here");
source.Load(reader);

XslCompiledTransform transformer = new XslCompiledTransform(false);
transformer.Load("xslt Path");

XmlWriterSettings settings = transformer.OutputSettings.Clone();
settings.NewLineHandling = NewLineHandling.Replace;
settings.NewLineChars = "\r\n";

StringBuilder builder = new StringBuilder();

using (XmlWriter writer = XmlWriter.Create(builder, settings))
{
    transformer.Transform(source, this.xsltArgs, writer);
}

string transformedXml = builder.ToString();

The result of this transformation is:

<?xml version="1.0" encoding="utf-16"?>
<add>
<doc>
<field name="Body">Line break 1 line break 2 line break 3 Some more text.</field>
</doc>
</add>

As you can see, both the CDATA section and the line breaks have been removed. It's OK to lose the CDATA section at this stage, but I need to retain the line breaks. Whichever combination of the NewLineHandling and NewLineChars settings I use (or, indeed, if I omit them altogether), I get the same result.

Is there something else that I need to be doing?


Solution

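  • This was resolved by changing the XSLT file so that the Body field is copied to the output unchanged with xsl:copy-of, and an empty template stops xsl:apply-templates from processing it a second time: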

    <xsl:template match="add">
        <add>
            <xsl:for-each select="doc">
                <doc>
                    <xsl:copy-of select="field[@name = 'Body']" />
                    <xsl:apply-templates select="@* | node()" />
                </doc>
            </xsl:for-each>
        </add>
    </xsl:template>
    
    <xsl:template match="field[@name = 'Body']" />
    
    <xsl:template match="field[@name = 'Source']">
        <field>
            <xsl:attribute name="name">Source</xsl:attribute>
            <xsl:value-of select="normalize-space(.)"/>
        </field>
    </xsl:template>
    
    <xsl:template match="field[@name = 'Section']">
        <field>
            <xsl:attribute name="name">Section</xsl:attribute>
            BikesForSale
        </field>
    </xsl:template>
    
    <xsl:template match="field[@name = 'FirstSeen']">
        <field>
            <xsl:attribute name="name">PublishedDate</xsl:attribute>
            <xsl:value-of select="."/>
        </field>
    </xsl:template>
    
    ...more here, removed for brevity.
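
The effect of the change can be checked with a small, self-contained program. This is only a sketch under stated assumptions: the question does not show the template that originally handled `field`, so the "before" stylesheet here guesses at one that calls `normalize-space()`, which would produce the collapsed output shown in the question. The class name `LineBreakDemo`, the `Transform` helper, and the reduced input are illustrative and not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

public static class LineBreakDemo
{
    // Reduced input with real line breaks inside the CDATA section.
    public static readonly string SourceXml =
        "<add><doc><field name=\"Body\">" +
        "<![CDATA[Line break 1\r\n\r\nline break 2]]>" +
        "</field></doc></add>";

    // Wraps template markup in a minimal XSLT 1.0 stylesheet element.
    private static string Stylesheet(string templates)
    {
        return "<xsl:stylesheet version=\"1.0\" " +
               "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
               templates + "</xsl:stylesheet>";
    }

    // ASSUMPTION: a guess at the original behaviour, not taken from the question.
    public static readonly string NormalizingXslt = Stylesheet(
        "<xsl:template match=\"add\"><add><xsl:for-each select=\"doc\"><doc>" +
        "<xsl:apply-templates select=\"@* | node()\"/>" +
        "</doc></xsl:for-each></add></xsl:template>" +
        "<xsl:template match=\"field\"><field name=\"{@name}\">" +
        "<xsl:value-of select=\"normalize-space(.)\"/></field></xsl:template>");

    // Mirrors the fix: copy Body verbatim, then suppress it in apply-templates.
    public static readonly string CopyingXslt = Stylesheet(
        "<xsl:template match=\"add\"><add><xsl:for-each select=\"doc\"><doc>" +
        "<xsl:copy-of select=\"field[@name = 'Body']\"/>" +
        "<xsl:apply-templates select=\"@* | node()\"/>" +
        "</doc></xsl:for-each></add></xsl:template>" +
        "<xsl:template match=\"field[@name = 'Body']\"/>");

    // Same pipeline as in the question, with the stylesheet loaded from a string.
    public static string Transform(string xml, string xslt)
    {
        var source = new XmlDocument();
        source.Load(new StringReader(xml));

        var transformer = new XslCompiledTransform(false);
        using (XmlReader xsltReader = XmlReader.Create(new StringReader(xslt)))
        {
            transformer.Load(xsltReader);
        }

        XmlWriterSettings settings = transformer.OutputSettings.Clone();
        settings.NewLineHandling = NewLineHandling.Replace;
        settings.NewLineChars = "\r\n";

        var builder = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(builder, settings))
        {
            transformer.Transform(source, null, writer);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine("normalize-space():");
        Console.WriteLine(Transform(SourceXml, NormalizingXslt));
        Console.WriteLine("xsl:copy-of:");
        Console.WriteLine(Transform(SourceXml, CopyingXslt));
    }
}
```

If the guess about `normalize-space()` is right, it would also explain why the `NewLineHandling` and `NewLineChars` settings made no difference: they only govern how the writer emits new line characters that reach it, and a stylesheet that has already collapsed them to spaces leaves nothing for the writer to replace.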