
View all text of an element with XmlReader C#


I'm using an XmlReader to iterate through some XML. Some of the XML is actually HTML and I want to get the text content from the node.

Example XML:

<?xml version="1.0" encoding="UTF-8"?>
<data>
  <p>Here is some <b>data</b></p>
</data>

Example code:

using (XmlReader reader = XmlReader.Create(myUrl))
{
  while (reader.Read()) 
  {
    if (reader.Name == "p")
    { 
      // I want to get all the TEXT contents from this node
      myVar = reader.Value;
    }
  }
}

This doesn't get me all the contents. How do I get all the text content of the <p> node in that situation?


Solution

  • Use ReadInnerXml:

            StringReader myUrl = new StringReader(@"<?xml version=""1.0"" encoding=""UTF-8""?>
    <data>
      <p>Here is some <b>data</b></p>
    </data>");
            using (XmlReader reader = XmlReader.Create(myUrl))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "p")
                    {
                        // Print the inner markup of this node, child tags included
                        Console.WriteLine(reader.ReadInnerXml());
                    }
                }
            }
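
One caveat with the loop above: ReadInnerXml already advances the reader past the closing tag, so the following reader.Read() can step over the next node. With indented XML that node is usually whitespace, but with adjacent <p> elements it can be the next <p>. A minimal sketch of a loop that avoids this, assuming a hypothetical helper named ReadAllInnerXml (not part of the original answer):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class InnerXmlDemo
{
    // Hypothetical helper: collects the inner markup of every <p> element.
    public static List<string> ReadAllInnerXml(string xml)
    {
        var results = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "p")
                {
                    // ReadInnerXml moves past </p> itself, so Read() is not called here.
                    results.Add(reader.ReadInnerXml());
                }
                else
                {
                    // Any other node (including the initial state): advance by one.
                    reader.Read();
                }
            }
        }
        return results;
    }
}
```

Calling InnerXmlDemo.ReadAllInnerXml("<data><p>Here is some <b>data</b></p><p>more</p></data>") should return two entries, one per <p>, even though the elements are adjacent.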
    

    Or, if you want to drop the <b> markup as well, you can use an auxiliary reader over the subtree and read only its text nodes:

            StringReader myUrl = new StringReader(@"<?xml version=""1.0"" encoding=""UTF-8""?>
    <data>
      <p>Here is some <b>data</b></p>
    </data>");
            StringBuilder myVar = new StringBuilder();
            using (XmlReader reader = XmlReader.Create(myUrl))
            {
                while (reader.Read())
                {
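                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "p")
                    {
                        // Dispose the subtree reader before using the parent reader again
                        using (XmlReader pReader = reader.ReadSubtree())
                        {
                            while (pReader.Read())
                            {
                                if (pReader.NodeType == XmlNodeType.Text)
                                {
                                    myVar.Append(pReader.Value);
                                }
                            }
                        }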
                    }
                }
            }
    
            Console.WriteLine(myVar.ToString());
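
If the document contains several matching elements, the subtree approach can be wrapped in a small helper that returns one string per element. The sketch below is an illustration only: the class name ElementText and the method ReadAll are hypothetical, and it also collects CDATA sections, which the snippet above does not. Whitespace-only nodes between child tags are skipped.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class ElementText
{
    // Hypothetical helper: returns the concatenated text of each <elementName> element.
    public static List<string> ReadAll(string xml, string elementName)
    {
        var results = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // Only start a subtree on a start tag with the requested name.
                if (reader.NodeType != XmlNodeType.Element || reader.Name != elementName)
                    continue;

                var text = new StringBuilder();
                // The subtree reader is disposed before the parent reader is used again.
                using (XmlReader sub = reader.ReadSubtree())
                {
                    while (sub.Read())
                    {
                        if (sub.NodeType == XmlNodeType.Text ||
                            sub.NodeType == XmlNodeType.CDATA)
                        {
                            text.Append(sub.Value);
                        }
                    }
                }
                results.Add(text.ToString());
            }
        }
        return results;
    }
}
```

For the example XML in the question, ElementText.ReadAll(xml, "p") should yield a single entry, "Here is some data".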