
Is there a way to read raw content from XmlReader?

I have a very large XML file, so I am using XmlReader in C#. The problem is that some of the content contains XML-like markers that should not be parsed as markup by XmlReader.

<Narf name="DOH">Mark's test of <newline> like stuff</Narf>

This is legacy data, so it cannot be refactored... (of course)

I have tried ReadInnerXml but get the whole node. I have tried ReadElementContentAsString but get an exception saying 'newline' is not closed.

// Does not deal with markup in the content (Both lines)
ms.mText = reader.ReadElementContentAsString(); 
XElement el = XNode.ReadFrom(reader) as XElement; ms.mText = el.ToString();

What I want is ms.mText to equal "Mark's test of <newline> like stuff" and not an exception.

System.Xml.XmlException was unhandled
  Message=The 'newline' start tag on line 56 position 42 does not match the end tag of 'Narf'. Line 56, position 63.
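
For anyone who wants to reproduce this without the original file, here is a minimal sketch that parses just the sample element from a string. The class and method names (`Repro`, `TryRead`) are made up for illustration, and the line and position numbers in the message will differ from the ones above because the fragment is parsed on its own.

```csharp
using System;
using System.IO;
using System.Xml;

public static class Repro
{
    // Hypothetical helper: returns the parser's error message,
    // or null if the input turns out to be well-formed.
    public static string TryRead(string xml)
    {
        try
        {
            using (var reader = XmlReader.Create(new StringReader(xml)))
            {
                // Walk every node; a well-formedness error surfaces as XmlException
                while (reader.Read()) { }
            }
            return null;
        }
        catch (XmlException ex)
        {
            return ex.Message;
        }
    }

    public static void Main()
    {
        // <newline> is never closed, so the reader rejects the document
        Console.WriteLine(TryRead("<Narf name=\"DOH\">Mark's test of <newline> like stuff</Narf>"));
    }
}
```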

The duplicate flagged question did not solve the problem because it requires changing the input to remove the problem before using the data. As stated above, this is legacy data.


  • I figured it out based on responses here! Not elegant, but works...

       using System.IO;

       // Wraps a file and rewrites known non-XML markers before XmlReader sees them.
       public class TextWedge : TextReader
       {
          private const int LookAhead = 50;  // must exceed the longest marker
          private readonly StreamReader mSr;
          private string mBuffer = "";

          public TextWedge(string filename)
          {
             mSr = File.OpenText(filename);
             Fill();
          }

          // Top up the look-ahead buffer, then rewrite any complete markers in it
          private void Fill()
          {
             int ic;
             while (mBuffer.Length < LookAhead && (ic = mSr.Read()) != -1)
                mBuffer += (char)ic;
             // Run through the battery of non-xml tags
             mBuffer = mBuffer.Replace("<newline>", "[br]");
          }

          public override int Peek()
          {
             return mBuffer.Length > 0 ? mBuffer[0] : -1;
          }

          public override int Read()
          {
             if (mBuffer.Length == 0)
                return -1;
             int iRet = mBuffer[0];
             mBuffer = mBuffer.Remove(0, 1);
             Fill();
             return iRet;
          }
       }
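
If the goal is to get the literal text `<newline>` back in `ms.mText` (as the question asks) rather than `[br]`, one option is to replace the marker with its escaped form `&lt;newline&gt;`, which XmlReader decodes back to the original characters. Below is a minimal sketch of that idea, assuming `<newline>` is the only offending marker. `EscapeDemo`, `EscapeLegacyMarkers` and `ReadNarf` are hypothetical names, and the sketch works on an in-memory string for brevity.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EscapeDemo
{
    // Hypothetical helper: escape the known legacy marker so it is parsed as text.
    public static string EscapeLegacyMarkers(string xml)
    {
        return xml.Replace("<newline>", "&lt;newline&gt;");
    }

    // Reads the text content of the first Narf element.
    public static string ReadNarf(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(EscapeLegacyMarkers(xml))))
        {
            reader.ReadToFollowing("Narf");
            return reader.ReadElementContentAsString();
        }
    }

    public static void Main()
    {
        string xml = "<Narf name=\"DOH\">Mark's test of <newline> like stuff</Narf>";
        Console.WriteLine(ReadNarf(xml));
    }
}
```

For the very large file, the same replacement string could be used in the wedge's `Replace` call instead of `"[br]"`, so the data is still streamed rather than loaded into memory.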