I'm trying to load a piece of (possibly) malformed HTML into an XMLDocument object, but it fails with XMLExceptions... since there are extra opening/closing tags, and malformed XML tags such as <img >
instead of <img />
How do I get the XML to parse with all the errors in the data? Is there any XML validator that I can apply before parsing, to correct these errors? Or would handling the exception parse whatever can be parsed?
The HTML Agility Pack will parse html, rather than xhtml, and is quite forgiving. The object model will be familiar if you've used XmlDocument
.