
XmlReader.ReadInnerXml() yields unexpected results


(.NET 4.5.2, 64-bit)

I'm trying to parse the Outlook 2007 STIG from IASE (http://iase.disa.mil/stigs/Documents/U_MicrosoftOutlook2007_V4R13_STIG.zip) using XmlReader.

I'm running into a small problem with XmlReader's ReadInnerXml() function that I don't understand (note that "path" is the path to the xml file contained in the referenced zip):

using (var sr = new StreamReader(path))
{
  using (XmlReader reader = XmlReader.Create(sr))
  {
    while (reader.Read())
    {
      if (reader.Name.Equals("Rule") && reader.NodeType != XmlNodeType.EndElement)
      {
        Console.WriteLine("Found rule: " + reader.GetAttribute("id"));
      }
    } 
  }
}

The above code returns the following list of IDs, which is expected:

Found rule: SV-18181r1_rule
Found rule: SV-18188r1_rule
Found rule: SV-18203r1_rule
Found rule: SV-18602r1_rule
Found rule: SV-18213r1_rule
Found rule: SV-35249r3_rule
Found rule: SV-18641r1_rule
Found rule: SV-18655r1_rule
Found rule: SV-18657r1_rule
Found rule: SV-18663r1_rule
Found rule: SV-18667r1_rule
Found rule: SV-18671r1_rule
Found rule: SV-18673r1_rule
Found rule: SV-18675r1_rule
Found rule: SV-18677r1_rule
Found rule: SV-18679r1_rule
Found rule: SV-18681r1_rule
Found rule: SV-18683r1_rule
Found rule: SV-18685r1_rule
Found rule: SV-18687r1_rule
Found rule: SV-18689r1_rule
Found rule: SV-18708r1_rule
Found rule: SV-18710r1_rule
Found rule: SV-18712r1_rule
Found rule: SV-18729r1_rule
Found rule: SV-18731r1_rule
Found rule: SV-18735r1_rule
Found rule: SV-18743r1_rule
Found rule: SV-18749r1_rule
Found rule: SV-18752r1_rule
Found rule: SV-18766r1_rule
Found rule: SV-18775r3_rule
Found rule: SV-18779r3_rule
Found rule: SV-18838r1_rule
Found rule: SV-18840r1_rule
Found rule: SV-62707r1_rule
Found rule: SV-18842r1_rule
Found rule: SV-18844r1_rule
Found rule: SV-18846r1_rule
Found rule: SV-18848r1_rule
Found rule: SV-18850r1_rule
Found rule: SV-18852r1_rule
Found rule: SV-18910r1_rule
Found rule: SV-18912r1_rule
Found rule: SV-18916r2_rule
Found rule: SV-18918r1_rule
Found rule: SV-18920r1_rule
Found rule: SV-18935r1_rule
Found rule: SV-18946r1_rule
Found rule: SV-18948r1_rule
Found rule: SV-18950r2_rule
Found rule: SV-18958r1_rule
Found rule: SV-18960r1_rule
Found rule: SV-18962r1_rule
Found rule: SV-18964r1_rule
Found rule: SV-18970r1_rule
Found rule: SV-18978r1_rule
Found rule: SV-18980r1_rule
Found rule: SV-18985r1_rule
Found rule: SV-18988r1_rule
Found rule: SV-18990r1_rule
Found rule: SV-18992r1_rule
Found rule: SV-18995r1_rule
Found rule: SV-19005r1_rule
Found rule: SV-19010r1_rule
Found rule: SV-19012r1_rule
Found rule: SV-19014r1_rule
Found rule: SV-19018r1_rule
Found rule: SV-19023r1_rule
Found rule: SV-19026r1_rule
Found rule: SV-19028r1_rule
Found rule: SV-19030r1_rule
Found rule: SV-19032r1_rule
Found rule: SV-19038r1_rule
Found rule: SV-19040r1_rule
Found rule: SV-19042r1_rule
Found rule: SV-19050r1_rule
Found rule: SV-19435r1_rule

However, changing the code to:

using (var sr = new StreamReader(path))
{
  using (XmlReader reader = XmlReader.Create(sr))
  {
    while (reader.Read())
    {
      if (reader.Name.Equals("Rule") && reader.NodeType != XmlNodeType.EndElement)
      {
        Console.WriteLine("Found rule: " + reader.GetAttribute("id"));
        reader.ReadInnerXml();
      }
    }
  } 
}

changes the result to:

Found rule: SV-18181r1_rule
Found rule: SV-18188r1_rule
Found rule: SV-18203r1_rule
Found rule: SV-18602r1_rule
Found rule: SV-18213r1_rule
Found rule: SV-35249r3_rule
Found rule: SV-18641r1_rule
Found rule: SV-18655r1_rule
Found rule: SV-18657r1_rule
Found rule: SV-18663r1_rule
Found rule: SV-18667r1_rule
Found rule: SV-18671r1_rule
Found rule: SV-18673r1_rule
Found rule: SV-18675r1_rule
Found rule: SV-18677r1_rule
Found rule: SV-18679r1_rule
Found rule: SV-18681r1_rule
Found rule: SV-18683r1_rule
Found rule: SV-18685r1_rule
Found rule: SV-18687r1_rule
Found rule: SV-18689r1_rule
Found rule: SV-18708r1_rule
Found rule: SV-18710r1_rule
Found rule: SV-18712r1_rule
Found rule: SV-18729r1_rule
Found rule: SV-18731r1_rule
Found rule: SV-18735r1_rule
Found rule: SV-18743r1_rule
Found rule: SV-18749r1_rule
Found rule: SV-18752r1_rule
Found rule: SV-18766r1_rule
Found rule: SV-18775r3_rule
Found rule: SV-18779r3_rule
Found rule: SV-18838r1_rule
Found rule: SV-18840r1_rule
Found rule: SV-18842r1_rule
Found rule: SV-18844r1_rule
Found rule: SV-18846r1_rule
Found rule: SV-18848r1_rule
Found rule: SV-18850r1_rule
Found rule: SV-18852r1_rule
Found rule: SV-18910r1_rule
Found rule: SV-18912r1_rule
Found rule: SV-18916r2_rule
Found rule: SV-18918r1_rule
Found rule: SV-18920r1_rule
Found rule: SV-18935r1_rule
Found rule: SV-18946r1_rule
Found rule: SV-18948r1_rule
Found rule: SV-18950r2_rule
Found rule: SV-18958r1_rule
Found rule: SV-18960r1_rule
Found rule: SV-18962r1_rule
Found rule: SV-18964r1_rule
Found rule: SV-18970r1_rule
Found rule: SV-18978r1_rule
Found rule: SV-18980r1_rule
Found rule: SV-18985r1_rule
Found rule: SV-18988r1_rule
Found rule: SV-18990r1_rule
Found rule: SV-18992r1_rule
Found rule: SV-18995r1_rule
Found rule: SV-19005r1_rule
Found rule: SV-19010r1_rule
Found rule: SV-19012r1_rule
Found rule: SV-19014r1_rule
Found rule: SV-19018r1_rule
Found rule: SV-19023r1_rule
Found rule: SV-19026r1_rule
Found rule: SV-19028r1_rule
Found rule: SV-19030r1_rule
Found rule: SV-19032r1_rule
Found rule: SV-19038r1_rule
Found rule: SV-19040r1_rule
Found rule: SV-19042r1_rule
Found rule: SV-19050r1_rule
Found rule: SV-19435r1_rule

Can someone please explain why SV-62707r1_rule is missing when I call ReadInnerXml() on each Rule? Even better, could someone please describe how to get the inner XML string of all Rule elements without it skipping one of them?


Solution

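  • Your loop calls reader.Read() on every iteration. When the reader is on a Rule start tag, reader.ReadInnerXml() consumes the whole Rule element, including its end tag, and leaves the reader positioned on the node that follows it. The next reader.Read() then advances past that node before your code has inspected it. If that node is another Rule start tag (for example, when two Rule elements are adjacent with no whitespace between them), that rule is missed. A simple fix is to change the if to a while in your second snippet, so the node the reader lands on after ReadInnerXml() is checked before Read() is called again.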
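
To make the if-to-while change concrete, here is a minimal sketch of the loop, assuming the goal is to collect the id and inner XML of every Rule element. The class name RuleReader and the method name ReadRules are hypothetical, introduced only for illustration; the XmlReader calls are the same ones used in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original question or answer.
public static class RuleReader
{
    // Returns one (id, inner XML) pair per Rule element, including Rule
    // elements that directly follow each other.
    public static List<KeyValuePair<string, string>> ReadRules(TextReader input)
    {
        var rules = new List<KeyValuePair<string, string>>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                // 'while' instead of 'if': ReadInnerXml() leaves the reader on
                // the node after </Rule>, so that node is re-checked here
                // before the outer Read() moves past it.
                while (reader.NodeType == XmlNodeType.Element && reader.Name == "Rule")
                {
                    string id = reader.GetAttribute("id");
                    string inner = reader.ReadInnerXml();
                    rules.Add(new KeyValuePair<string, string>(id, inner));
                }
            }
        }
        return rules;
    }
}
```

With the file from the question, this could be called as RuleReader.ReadRules(new StreamReader(path)). The inner loop runs again only while the reader is sitting on a Rule start tag, so all other nodes are still consumed by the outer Read() exactly as before.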