
XmlDocument.LoadXml() hanging for several minutes


I have a routine that parses an XML response from an HTTP request and I use XmlDocument.LoadXml to help do this. I count on this method throwing an exception on bad XML and returning a loaded up XmlDocument object when successful.

What I didn't expect is for it to hang for several minutes loading a document. When I run this code in a test environment, it hangs for several minutes 100% of the time. Looks like some bug in .NET to me...

    Dim tstring As String = ""

    tstring &= "" & vbCrLf
    tstring &= "" & vbCrLf
    tstring &= "<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">" & vbCrLf
    tstring &= "" & vbCrLf
    tstring &= "<html xmlns=""http://www.w3.org/1999/xhtml"" >" & vbCrLf
    tstring &= "<head><title>" & vbCrLf
    tstring &= "    Error" & vbCrLf
    tstring &= "</title></head>" & vbCrLf
    tstring &= "<body>" & vbCrLf
    tstring &= "</body>" & vbCrLf
    tstring &= "</html>" & vbCrLf

    Dim MyXmlDoc As New XmlDocument
    MyXmlDoc.LoadXml(tstring)

The specific line in the document that can be removed to keep it from hanging is:

    tstring &= "<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" ""http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">" & vbCrLf

Am I going to have to search for "<!DOCTYPE html" in the string and not call LoadXml() if I see it? My concern about this is what other gotchas are waiting for me inside this method?


Solution

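  • The LoadXml call processes the DOCTYPE declaration, so it has to fetch the DTD from that URL, and that download is slow in this case. LoadXml parses the DTD even though it does not validate against it. You can check how slowly the URL responds by requesting it directly in your browser.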

    Another question provides a workaround - to quote:

    in .NET 4.0 XmlTextReader has a property called DtdProcessing. When set to DtdProcessing.Ignore it should disable DTD processing.

    and

    doc.XmlResolver = null;

    for .NET 3.5 should work.
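
Since the question is in VB.NET, here is a rough sketch of both workarounds quoted above. The module and function names (`LoadXmlWithoutDtd`, `LoadIgnoringDtd`, `LoadWithoutResolver`) are made up for illustration, and the sample string is a shortened version of the one in the question. The first function assumes .NET 4.0 or later; the second is the `XmlResolver` approach that the quoted answer says should work on .NET 3.5.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module LoadXmlWithoutDtd

    ' Hypothetical helper (.NET 4.0+): parse a string while ignoring the DOCTYPE,
    ' so the external DTD is never downloaded.
    Function LoadIgnoringDtd(ByVal xml As String) As XmlDocument
        Dim settings As New XmlReaderSettings()
        settings.DtdProcessing = DtdProcessing.Ignore ' skip the DOCTYPE entirely
        settings.XmlResolver = Nothing                ' do not resolve external resources

        Dim doc As New XmlDocument()
        doc.XmlResolver = Nothing
        Using reader As XmlReader = XmlReader.Create(New StringReader(xml), settings)
            doc.Load(reader)
        End Using
        Return doc
    End Function

    ' Hypothetical helper for the .NET 3.5 suggestion: clear the resolver
    ' before calling LoadXml so the DTD URL is not fetched.
    Function LoadWithoutResolver(ByVal xml As String) As XmlDocument
        Dim doc As New XmlDocument()
        doc.XmlResolver = Nothing ' VB.NET equivalent of "doc.XmlResolver = null;"
        doc.LoadXml(xml)
        Return doc
    End Function

    Sub Main()
        Dim xml As String = _
            "<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.0 Transitional//EN"" " & _
            """http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"">" & vbCrLf & _
            "<html xmlns=""http://www.w3.org/1999/xhtml"">" & _
            "<head><title>Error</title></head><body></body></html>"

        Dim doc As XmlDocument = LoadIgnoringDtd(xml)
        Console.WriteLine(doc.DocumentElement.Name)
    End Sub

End Module
```

One caveat with either approach: because the DTD is never read, named entities that XHTML defines there (such as `&nbsp;`) are not declared, so a document that uses them will most likely throw an XmlException instead of loading. Malformed XML still throws as before, which is the behaviour the question relies on.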