c# xml cdata xmlwriter

Why are additional elements added when calling XmlWriter.WriteCData(String) method?


I need to write the markup of other XML nodes, which themselves contain CDATA sections, into the CDATA section of an XML node. I am using .NET 6.

For example, I need to generate and write the following XML:

<?xml version="1.0" encoding="utf-8"?>
<items>
      <item><![CDATA[<some_node><![CDATA[cdata_value]]></some_node>]]></item>
</items>

To generate the XML, I use the following code:

using System.IO;
using System.Text;
using System.Xml;

string str_xml = String.Empty;
    
using (MemoryStream m_stream = new MemoryStream())
{
    using (XmlWriter xml_writer = XmlWriter.Create(m_stream, new XmlWriterSettings { Encoding = Encoding.UTF8, Indent = true, OmitXmlDeclaration = false }))
    {
        xml_writer.WriteStartElement("items");
        xml_writer.WriteStartElement("item");
        xml_writer.WriteCData("<some_node><![CDATA[cdata_value]]></some_node>");
        xml_writer.WriteFullEndElement(); // closes <item>

        xml_writer.WriteEndElement();     // closes <items>
        xml_writer.Flush();
    }
    
    m_stream.Position = 0;
    using (StreamReader stream_reader = new StreamReader(m_stream))
    {
        str_xml = stream_reader.ReadToEnd();
    }
}

The value of the str_xml variable in the debugger:

<?xml version="1.0" encoding="utf-8"?>
    <items>
       <item>
          <![CDATA[<some_node><![CDATA[cdata_value]]]]><![CDATA[></some_node>]]>
       </item>
    </items>

Why is the value in the CDATA section of the item node incorrectly formed?


Solution

  • You're trying to produce something that isn't well-formed XML, so it's not surprising that it doesn't work: CDATA sections in XML can't be nested. A CDATA section ends at the first ]]> it contains.

    What's happening is that when you ask it to write CDATA containing ]]>, the XmlWriter knows the result wouldn't be well-formed, because that sequence would end the CDATA section early. So it closes the first CDATA section after ]] and starts a new one for the remaining >. The output is therefore well-formed, and the text content of item, as seen by a parser reading it back, is exactly the string you passed in. There are various ways of handling this and different libraries will do it differently, but they will all end up with something along these lines.

    The inner content, cdata_value, doesn't need to be in a CDATA section at all, because it doesn't contain any special characters such as < or &.
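
To illustrate the point about the split CDATA section, here is a minimal sketch (not part of the original answer) that writes the same string once with WriteCData and once with WriteString, then reads both results back with XDocument. The helper name Build, the StringWriter target, and the use of top-level statements are assumptions made for the example. In both cases the text read back from item should equal the original string, even though the serialized markup differs.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// The text we want to carry inside <item>; note that it contains "]]>".
string inner = "<some_node><![CDATA[cdata_value]]></some_node>";

// Hypothetical helper: writes <items><item>...</item></items> and returns the XML text.
string Build(Action<XmlWriter> writeContent)
{
    var sw = new StringWriter();
    var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
    using (XmlWriter w = XmlWriter.Create(sw, settings))
    {
        w.WriteStartElement("items");
        w.WriteStartElement("item");
        writeContent(w);
        w.WriteEndElement(); // closes <item>
        w.WriteEndElement(); // closes <items>
    }
    return sw.ToString();
}

// Option 1: CDATA. The writer splits the section wherever "]]>" occurs.
string viaCData = Build(w => w.WriteCData(inner));

// Option 2: ordinary text. The writer escapes markup characters instead.
string viaString = Build(w => w.WriteString(inner));

// Read both documents back and take the text content of <item>.
var readCData = XDocument.Parse(viaCData).Root?.Element("item")?.Value;
var readString = XDocument.Parse(viaString).Root?.Element("item")?.Value;

Console.WriteLine(viaCData);
Console.WriteLine(viaString);
Console.WriteLine(readCData == inner);  // the CDATA version round-trips
Console.WriteLine(readString == inner); // the escaped version round-trips
```

Whichever form is used, a consumer that needs the inner markup has to parse the text of item as XML in a second step; the outer parser only ever sees it as character data.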