Inserting an XML string into an existing document with C# breaks everything


I'm using C# with the System.Xml classes to manipulate an existing XML document.

In particular, I need to insert pieces of XML that I have in string form inside specific nodes in the document.

To do this, I create a new element and then assign my XML string to its InnerXml property, like this:

using System.Xml;

string nsUri = "http://myorg/myns";
string baseDocument = @$"<?xml version=""1.0"" encoding=""utf-16""?><Root xmlns=""{nsUri}""></Root>";

XmlDocument doc = new XmlDocument();
doc.LoadXml(baseDocument);
var nsm = new XmlNamespaceManager(doc.NameTable);
nsm.AddNamespace("default", nsUri);

var target = doc.CreateElement("Target", nsUri);
target.InnerXml = @"
    <Sub1>This is a subnode</Sub1>
    <Sub2>Here is another subnode</Sub2>
";
doc.DocumentElement.AppendChild(target);

doc.Save(@"C:\text.xml");

Looking at the output, here's the result:

<?xml version="1.0" encoding="utf-16"?>
<Root xmlns="http://myorg/myns">
  <Target>
    <Sub1 xmlns="">This is a subnode</Sub1>
    <Sub2 xmlns="">Here is another subnode</Sub2>
</Target>
</Root>

As you can see, there's an issue: every sub-node inside the XML string I inserted gets assigned an empty xmlns attribute: xmlns="".

This is already a problem in itself, but the far worse issue is that the document then seems to be in an inconsistent state, and further manipulation produces weird results. For example, if I then run:

var sub1 = target.SelectSingleNode("default:Sub1", nsm);

then sub1 is null.
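
One way to narrow this down is to look at the NamespaceURI of the nodes involved. The following is only a diagnostic sketch that reuses the names from the snippet above (with a shortened InnerXml string and no file output); it assumes the null result comes from a namespace mismatch rather than from a corrupted document:

```csharp
using System;
using System.Xml;

string nsUri = "http://myorg/myns";
var doc = new XmlDocument();
doc.LoadXml($@"<Root xmlns=""{nsUri}""></Root>");
var nsm = new XmlNamespaceManager(doc.NameTable);
nsm.AddNamespace("default", nsUri);

// Same pattern as above: create <Target>, fill it via InnerXml, then attach it.
var target = doc.CreateElement("Target", nsUri);
target.InnerXml = "<Sub1>This is a subnode</Sub1>";
doc.DocumentElement.AppendChild(target);

// <Target> carries the document namespace; the parsed child has an empty one,
// which is what the xmlns="" in the saved output reflects.
Console.WriteLine($"Target namespace: '{target.NamespaceURI}'");
Console.WriteLine($"Sub1 namespace:   '{target.FirstChild.NamespaceURI}'");

// A prefixed name test only matches elements in nsUri, so it misses <Sub1>.
// An unprefixed name test matches elements in no namespace, so it finds <Sub1>.
var prefixed = target.SelectSingleNode("default:Sub1", nsm);
var unprefixed = target.SelectSingleNode("Sub1", nsm);
Console.WriteLine($"default:Sub1 found: {prefixed != null}");
Console.WriteLine($"Sub1 found:         {unprefixed != null}");
```

If the unprefixed query finds the node, the document itself is consistent and the inserted elements have simply ended up outside the document's namespace.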

So, apparently this is not the correct way to do this, and the problem happens when I reassign InnerXml. Can you tell me the proper way to do this?

Some notes:

  • I know I can create the elements and attributes one by one instead of just dumping the XML string inside InnerXml, but the XML fragments I need to insert are very long and it would be absurd to manually recreate them in code.
  • I tried doing this with XmlDocumentFragment, but the result is exactly the same.

Solution

  • XmlElement.InnerXml apparently considers an element without an explicit namespace as belonging to the empty namespace. Therefore, the best solution is to fix the problem at the source: Ensure that whatever generates your XML string specifies the correct namespace for every element. (Otherwise, out of context, it's unclear what namespace <Sub1> and <Sub2> belong to.)

    However, if that's not feasible, and you're willing to assume that every element in your XML string without an xmlns declaration belongs to a specified default namespace, then an alternative approach is to create a document fragment, set its InnerXml to your XML string wrapped in a <Target> element that declares the desired default namespace, and then append the fragment to the document:

    // As before...
    string nsUri = "http://myorg/myns";
    string baseDocument = @$"<?xml version=""1.0"" encoding=""utf-16""?><Root xmlns=""{nsUri}""></Root>";
    
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(baseDocument);
    var nsm = new XmlNamespaceManager(doc.NameTable);
    nsm.AddNamespace("default", nsUri);
    
    // Create the <Target> element using CreateDocumentFragment instead of CreateElement:
    var fragment = doc.CreateDocumentFragment();
    fragment.InnerXml = @$"<Target xmlns=""{nsUri}"">
        <Sub1>This is a subnode</Sub1>
        <Sub2>Here is another subnode</Sub2>
      </Target>";
    // Appending a fragment moves its children into the document; the node returned here is <Target>.
    var node = doc.DocumentElement.AppendChild(fragment);
    node.Attributes.RemoveNamedItem("xmlns"); // Remove the redundant xmlns attribute on <Target>.
    
    doc.Save(@"C:\text.xml");
    

    Here's the result:

    <?xml version="1.0" encoding="utf-16"?>
    <Root xmlns="http://myorg/myns">
      <Target>
        <Sub1>This is a subnode</Sub1>
        <Sub2>Here is another subnode</Sub2>
      </Target>
    </Root>
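
If this has to be done in several places, the fragment approach above can be wrapped in a small helper. The following is a minimal sketch, not a library API: AppendParsedElement and its parameters are hypothetical names, and it assumes that the element name and namespace URI contain no characters that would need escaping inside the generated wrapper tag.

```csharp
using System;
using System.Xml;

// Hypothetical helper: wraps innerXml in <name xmlns="nsUri">, parses it through
// a document fragment, appends the resulting element to parent and returns it.
static XmlElement AppendParsedElement(XmlElement parent, string name, string nsUri, string innerXml)
{
    XmlDocumentFragment fragment = parent.OwnerDocument.CreateDocumentFragment();
    // The xmlns declaration on the wrapper is what places the children in nsUri.
    fragment.InnerXml = $@"<{name} xmlns=""{nsUri}"">{innerXml}</{name}>";
    var element = (XmlElement)fragment.FirstChild;
    parent.AppendChild(element);       // moves the element out of the fragment
    element.RemoveAttribute("xmlns");  // drop the declaration attribute, as in the snippet above
    return element;
}

string nsUri = "http://myorg/myns";
var doc = new XmlDocument();
doc.LoadXml($@"<Root xmlns=""{nsUri}""></Root>");
var nsm = new XmlNamespaceManager(doc.NameTable);
nsm.AddNamespace("default", nsUri);

var target = AppendParsedElement(doc.DocumentElement, "Target", nsUri,
    "<Sub1>This is a subnode</Sub1><Sub2>Here is another subnode</Sub2>");

// The prefixed lookup from the question should now resolve.
var sub1 = target.SelectSingleNode("default:Sub1", nsm);
Console.WriteLine(sub1?.InnerText);
```

Because the children are parsed while the wrapper's xmlns declaration is in scope, they end up in the same namespace as <Root>, so queries such as default:Sub1 match them.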