c# html xelement

How to prevent duplicate text value output by XElement.DescendantNodes()?


Problem: XElement.DescendantNodes() appears to output some parts TWICE.

Background:

I need to copy the entire contents of a <body> element into a new HTML document, wrapped in a <div> that carries an inline style attribute. This is for HTML mail, where inline styles should work better than a style block because many mail agents strip the <head> section. However, some parts of the content show up twice in the result. How can I fix this?

Here is the example input:

<body>
  some text
  <a href="http://www.nix.com/index.html">Click Me</a>
  <br />
  <span>more text</span>
</body>

This is the output. Apart from the duplicated strings, it is exactly what I need:

<body>
  <div style="font-family: Verdana; font-size: 12px;">
    some text
    <a href="http://www.nix.com/index.html">Click Me</a>
    Click Me           <<<===duplicate!!!
    <br />
    <span>more text</span>
    more text           <<<===duplicate!!!
  </div>
</body>

And this is the code. I expected DescendantNodes() to be the correct method for extracting both element nodes like <a> and text nodes like "some text":

        using System.Xml.Linq; // XElement, XAttribute

        XElement InputMail = 
            new XElement("body",
                "some text",
                new XElement("a",
                    new XAttribute("href", "http://www.nix.com/index.html"),
                    "Click Me"),
                new XElement("br"),
                new XElement("span", "more text"));

        XElement OutputMail =
            new XElement("body",
                new XElement("div",
                   new XAttribute("style", "font-family: Verdana; font-size: 12px;"),
                   InputMail.DescendantNodes()));

Solution

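  • DescendantNodes() returns every node below the element: children, grandchildren and so on, in document order. That is why you see duplicates. An element such as <a> is copied together with its own content, and then its inner text node "Click Me" is returned again as a separate item and added a second time. You only need the direct child nodes, because each child already brings its own descendants along. For that, use Nodes():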

    XElement OutputMail =
          new XElement("body",
              new XElement("div",
                 new XAttribute("style", "font-family: Verdana; font-size: 12px;"),
                 InputMail.Nodes()));
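
The difference can be checked with a small console program. This is only a sketch built around the sample data from the question; the class name NodesVersusDescendantNodes and the helpers BuildInput and Wrap are made up for illustration and are not part of the original code. It relies on the fact that LINQ to XML clones a node that already has a parent when it is added to a new element, so InputMail itself is left unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NodesVersusDescendantNodes
{
    // Hypothetical helper: rebuilds the sample <body> element from the question.
    public static XElement BuildInput()
    {
        return new XElement("body",
            "some text",
            new XElement("a",
                new XAttribute("href", "http://www.nix.com/index.html"),
                "Click Me"),
            new XElement("br"),
            new XElement("span", "more text"));
    }

    // Hypothetical helper: wraps the given nodes in <body><div style="...">.
    // Nodes that already have a parent are cloned by XElement when added.
    public static XElement Wrap(IEnumerable<XNode> content)
    {
        return new XElement("body",
            new XElement("div",
                new XAttribute("style", "font-family: Verdana; font-size: 12px;"),
                content));
    }

    public static void Main()
    {
        XElement input = BuildInput();

        // All nodes at every depth: "some text", <a>, "Click Me",
        // <br>, <span>, "more text".
        Console.WriteLine(input.DescendantNodes().Count()); // 6

        // Direct children only: "some text", <a>, <br>, <span>.
        Console.WriteLine(input.Nodes().Count());           // 4

        // The wrapped result without the duplicated text nodes.
        Console.WriteLine(Wrap(input.Nodes()));
    }
}
```

The two counts show where the extra strings come from: the two additional items returned by DescendantNodes() are exactly the text nodes inside <a> and <span>, which are already contained in the copies of their parent elements.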