Tags: .net, .net-4.0, linq-to-xml, xmlreader

XmlReader not recognizing EndElement


I have an XML string with no formatting (no whitespace between elements), similar to:

<SomeTag><Tag>tag 1</Tag><Tag>tag 2</Tag><Tag>tag 3</Tag><Tag>tag 4</Tag></SomeTag>

When I run this code:

using (XmlReader reader = XmlReader.Create(stream))
{
    reader.MoveToContent();

    while (reader.Read())
    {
        if ((reader.NodeType == XmlNodeType.Element) && (string.Compare(reader.Name, name, StringComparison.InvariantCultureIgnoreCase) == 0))
        {
            var element = (XElement)XNode.ReadFrom(reader);
            yield return element;
        }
    }
    reader.Close();
}

It only recognizes the "tag 1" and "tag 3" nodes as Element nodes; "tag 2" and "tag 4" are reported as Text nodes.

Why?

What do I do to fix it?

FYI, if I add formatting with line feeds after each tag it works as expected, recognizing all tags as elements. However, I do not have control over the XML that is given to me.


Solution

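  • I suspect the problem is that XNode.ReadFrom has already positioned the reader "on" the node that follows the element it just read. With no whitespace between elements, that node is the start of the next element - you're then calling Read, which moves past that start tag and onto the text inside it. With line feeds between the tags, ReadFrom would leave the reader on a whitespace node instead, and Read would then land on the next element, which would explain why formatted input works.

    That's just a guess though - it's the sort of thing that XmlReader makes tricky :( Try making the Read call conditional on whether you've just called ReadFrom. Something like this: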

    using (XmlReader reader = XmlReader.Create(stream))
    {
        reader.MoveToContent();
    
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element &&
                reader.Name.Equals(name, StringComparison.InvariantCultureIgnoreCase))
            {
                var element = (XElement)XNode.ReadFrom(reader);
                yield return element;
            }
            else
            {
                reader.Read();
            }
        }
    }
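
To check the loop above against the sample XML from the question, a minimal self-contained sketch might look like the following. The class name StreamingDemo, the method name StreamElements, and the use of a StringReader in place of the original stream are assumptions made for illustration; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class StreamingDemo
{
    // Hypothetical helper: yields every element whose name matches `name`,
    // ignoring case, using the conditional-Read loop from the answer.
    public static IEnumerable<XElement> StreamElements(TextReader input, string name)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();

            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.Name.Equals(name, StringComparison.InvariantCultureIgnoreCase))
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    // Only advance manually when ReadFrom was not called.
                    reader.Read();
                }
            }
        }
    }

    public static void Main()
    {
        // Same unformatted XML as in the question.
        const string xml =
            "<SomeTag><Tag>tag 1</Tag><Tag>tag 2</Tag><Tag>tag 3</Tag><Tag>tag 4</Tag></SomeTag>";

        foreach (XElement element in StreamElements(new StringReader(xml), "Tag"))
        {
            Console.WriteLine(element.Value);
        }
    }
}
```

With this input the loop should yield all four Tag elements, because the reader is only advanced with Read() when ReadFrom has not already moved it.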