Tags: c#, xml, xslt, xmlreader, xslcompiledtransform

Preserving whitespace within XML elements between attributes when using XslCompiledTransform


I am applying an XSLT stylesheet (xsltUri) to an XML file (TargetXmlFileName) using the XslCompiledTransform class:

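// Requires: using System.IO; using System.Text; using System.Xml; using System.Xml.Xsl;

// Load the stylesheet (false = no XSLT debugging support).
XslCompiledTransform xslTransform = new XslCompiledTransform(false);
xslTransform.Load(xsltUri);

using (var outStream = new MemoryStream())
{
    var writer = new StreamWriter(outStream, new UTF8Encoding());
    // Read the input, reporting all whitespace nodes and ignoring any DTD.
    using (var reader = new XmlTextReader(TargetXmlFileName)
    {
        WhitespaceHandling = WhitespaceHandling.All,
        DtdProcessing = DtdProcessing.Ignore
    })
    {
        xslTransform.Transform(reader, xsltArguments, writer);
    }

    // Make sure buffered text has reached the stream before copying it.
    writer.Flush();

    // Rewind and copy the transformed document to the output file.
    outStream.Position = 0;
    using (FileStream outFile = new FileStream(outputFileName, FileMode.Create))
    {
        outStream.CopyTo(outFile);
    }
}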

Input XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <element
    id="1"
    attr1="value11"
    attr2="value12"/>
  <element    id="2"    attr1="value21"    attr2="value22"/>
</root>

Input XSL:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="//element[@id='2']/@attr1">
    <xsl:attribute name="attr1">
      <xsl:value-of select="'newvalue21'"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>

Actual output XML:

<?xml version="1.0" encoding="utf-8"?><root>
  <element id="1" attr1="value11" attr2="value12" />
  <element id="2" attr1="newvalue21" attr2="value22" />
</root>

Desired output XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <element
    id="1"
    attr1="value11"
    attr2="value12"/>
  <element    id="2"    attr1="newvalue21"    attr2="value22"/>
</root>

Question: How can I preserve the whitespace (particularly, line breaks) of the input XML file within the "element" tags in the output XML file? I have experimented with different options, but nothing worked for this case.

Thanks for any hints!


Solution

  • The layout inside a tag (the whitespace between attributes) is not part of the XML document's content.

    1. As far as the XML data model is concerned, it does not exist.
    2. XML parsers accept it and then discard it, because of 1. The only restriction is that whitespace is illegal immediately after a <.
    3. XML serializers can lay out attributes however they like, because of 1 and 2. Most (if not all) separate attributes with a single space character.
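
    To see points 1 and 2 in practice, here is a minimal C# sketch. The class name AttributeWhitespaceDemo and the helper SameTree are made up for illustration. It parses the same element twice, once with line breaks between the attributes and once without, and compares the two trees with XNode.DeepEquals. Even with LoadOptions.PreserveWhitespace the trees should compare as equal, since that option only keeps whitespace text nodes and whitespace inside a tag never becomes a node.

```csharp
using System;
using System.Xml.Linq;

public static class AttributeWhitespaceDemo
{
    // Hypothetical helper: true when both strings parse to the same XML tree.
    public static bool SameTree(string first, string second)
    {
        // PreserveWhitespace keeps whitespace *text nodes* between elements;
        // it has nothing to keep for whitespace between attributes.
        XDocument a = XDocument.Parse(first, LoadOptions.PreserveWhitespace);
        XDocument b = XDocument.Parse(second, LoadOptions.PreserveWhitespace);
        return XNode.DeepEquals(a, b);
    }

    public static void Main()
    {
        // Same element, different layout inside the tag.
        string multiLine = "<root><element\n    id=\"1\"\n    attr1=\"value11\"/></root>";
        string singleLine = "<root><element id=\"1\" attr1=\"value11\"/></root>";

        Console.WriteLine(SameTree(multiLine, singleLine));
    }
}
```

    By the time the stylesheet runs, the transform is working on a tree like this one, so it has nothing left to copy the original line breaks from.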

    So...

    • Don't try to build an application that depends on the source code layout of XML.
    • Since this kind of source code layout in XML is technically irrelevant… get over your OCD. ;)
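
If a uniform layout is acceptable (every attribute on its own line, in every element), the nearest built-in option I know of is XmlWriterSettings.NewLineOnAttributes, which only takes effect together with Indent. The sketch below is an illustration, not a way to reproduce the input's original layout. The class UniformAttributeLayout and its Reserialize method are hypothetical names, and the input is a compact sample rather than the file from the question.

```csharp
using System;
using System.IO;
using System.Xml;

public static class UniformAttributeLayout
{
    // Hypothetical helper: re-serializes XML with attributes on separate lines.
    public static string Reserialize(string xml)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,               // required for NewLineOnAttributes to apply
            NewLineOnAttributes = true,  // write attributes on new lines
            OmitXmlDeclaration = true    // keep the sample output short
        };

        var output = new StringWriter();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            // Copies the whole document from the reader to the writer.
            writer.WriteNode(reader, true);
        }
        return output.ToString();
    }

    public static void Main()
    {
        string xml = "<root><element id=\"2\" attr1=\"value21\" attr2=\"value22\"/></root>";
        Console.WriteLine(Reserialize(xml));
    }
}
```

An XmlWriter created with these settings can presumably also be passed to the Transform(XmlReader, XsltArgumentList, XmlWriter) overload in place of the StreamWriter. Note that the writer's settings then control the output, and the single-line layout of element id="2" would still not be kept.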