c# asp.net xmldocument

How to replace nodes in System.Xml.XmlDocument?


Consider the following xml:

<div>
   <a href="http://www.google.com/">This is:</a>
   <p>A test... <b>1</b><i>2</i><u>3</u></p>
   <p>This too</p>
   Finished.
</div>

The content of this XML is held in a System.Xml.XmlDocument instance. I need to replace every p element with its child nodes and add a break (br) after the content of each former paragraph. I've written the following code:

var pElement = xmlDocument.SelectSingleNode("//p");
while (pElement != null)
{
    var textNode = xmlDocument.CreateTextNode("");
    foreach (XmlNode child in pElement.ChildNodes)
    {
        textNode.AppendChild(child);
    }    
    textNode.AppendChild(xmlDocument.CreateElement("br"));
    pElement.ParentNode.ReplaceChild(pElement, textNode);
    pElement = xmlDocument.SelectSingleNode("//p");
}

I'm creating an empty text node and moving the child nodes of each paragraph into it. Unfortunately this doesn't work: a text node can't contain elements.

Any ideas on how to implement this replacement?


Solution

  • Looks like I found a solution using the InsertAfter method:

    var pElement = xmlDocument.SelectSingleNode("//p");
    
    while (pElement != null)
    {    
        //store position where new elements need to be added
        var position = pElement;
    
        while(pElement.FirstChild != null)
        {
            var child = pElement.FirstChild;
            position.ParentNode.InsertAfter(child, position);
    
            //store the added child as position for next child
            position = child;
        }
    
        //add break
        position.ParentNode.InsertAfter(xmlDocument.CreateElement("br"), position);
    
        //remove empty p
        pElement.ParentNode.RemoveChild(pElement);
    
        //select next p
        pElement = xmlDocument.SelectSingleNode("//p");
    }
    

    The idea is as follows:

    1. Loop through all p nodes.
    2. Loop through all child nodes of the current p.
    3. Move each child into the parent of p, directly after the previously moved node.
    4. Add a break after the last moved child.
    5. Remove the now empty p element.

    The position was quite tricky to find. The first child node needs to be added to the parent node of p by using InsertAfter with p as the reference node. But the next child needs to be added after the previously added child. Solution: store the node that was just added and use it as the reference for the next insert.
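
    The position bookkeeping can probably be avoided by inserting before p instead of after it: each child then lands directly in front of p, so the original order is kept without tracking anything. This is a minimal sketch of that variant, not the code from the answer above. It assumes that no p element is the document element, and the names ParagraphUnwrapper and UnwrapParagraphs are made up for illustration:

    ```csharp
    using System;
    using System.Xml;

    public static class ParagraphUnwrapper
    {
        // Hypothetical helper: replaces every <p> with its children followed by a <br />.
        public static void UnwrapParagraphs(XmlDocument xmlDocument)
        {
            var pElement = xmlDocument.SelectSingleNode("//p");
            while (pElement != null)
            {
                var parent = pElement.ParentNode;

                // InsertBefore removes the node from p before inserting it,
                // so FirstChild advances by itself and the order is preserved.
                while (pElement.FirstChild != null)
                {
                    parent.InsertBefore(pElement.FirstChild, pElement);
                }

                // The break goes right before the now empty p, i.e. after its former content.
                parent.InsertBefore(xmlDocument.CreateElement("br"), pElement);
                parent.RemoveChild(pElement);

                // Select the next remaining p, if any.
                pElement = xmlDocument.SelectSingleNode("//p");
            }
        }
    }
    ```

    Calling ParagraphUnwrapper.UnwrapParagraphs(xmlDocument) on the document from the question should leave the a element untouched and turn both paragraphs into inline content, each followed by a br element.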

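    Note: iterating over the pElement.ChildNodes collection with foreach won't work here. Moving a node changes the collection while it is being enumerated, and in my case the iterator decided it was done after about half of the nodes had been moved. ChildNodes is a live view of the children, not a snapshot, so either keep taking FirstChild until it is null (as above) or copy the nodes into a separate list before moving them.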
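
    If a foreach loop is preferred, one option is to copy the nodes into a list before touching the tree. The sketch below does that and also comes back to the idea from the question of collecting the children in a container: an XmlDocumentFragment, unlike a text node, can hold elements, and inserting a fragment inserts its children instead of the fragment itself. This is an illustrative alternative rather than the method used above; ParagraphReplacer and ReplaceParagraphs are hypothetical names, and it again assumes that no p is the document element:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml;

    public static class ParagraphReplacer
    {
        // Hypothetical helper: swaps every <p> for a fragment holding its children plus a <br />.
        public static void ReplaceParagraphs(XmlDocument xmlDocument)
        {
            // Snapshot the matches so that editing the tree cannot disturb the loop.
            List<XmlNode> paragraphs = xmlDocument.SelectNodes("//p").Cast<XmlNode>().ToList();

            foreach (XmlNode p in paragraphs)
            {
                XmlDocumentFragment fragment = xmlDocument.CreateDocumentFragment();

                // ToList() copies the live ChildNodes collection before any node is moved.
                foreach (XmlNode child in p.ChildNodes.Cast<XmlNode>().ToList())
                {
                    fragment.AppendChild(child);
                }
                fragment.AppendChild(xmlDocument.CreateElement("br"));

                // ReplaceChild takes the new node first and the old node second.
                p.ParentNode.ReplaceChild(fragment, p);
            }
        }
    }
    ```

    Note the argument order of ReplaceChild: the code in the question passes the old node first, which would fail even with a valid replacement node.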