Tags: xml, delphi, xmlhttprequest, delphi-5, serverxmlhttp

IXMLHttpRequest.responseXml is empty, with no parse error, when responseText contains valid XML


i am fetching some XML from a government web-site:

http://www.bankofcanada.ca/stats/assets/rates_rss/noon/en_all.xml

i am using the following, fairly simple code:

var
   szUrl: string;
   http: IXMLHTTPRequest;
begin
   szUrl := 'http://www.bankofcanada.ca/stats/assets/rates_rss/noon/en_all.xml';

   http := CoXMLHTTP60.Create;
   http.open('GET', szUrl, False, '', '');
   http.send(EmptyParam);

   Assert(http.Status = 200);

   Memo1.Lines.Add('HTTP/1.1 '+IntToStr(http.status)+' '+http.statusText);
   Memo1.Lines.Add(http.getAllResponseHeaders);
   Memo1.Lines.Add(http.responseText);

i won't show the entire body that comes back, but responseText does contain valid xml:

HTTP/1.1 200 OK
Cache-Control: max-age=5
Connection: keep-alive
Connection: Transfer-Encoding
Date: Fri, 30 Mar 2012 14:50:50 GMT
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
Expires: Fri, 30 Mar 2012 14:50:55 GMT
Server: Apache/2.2.16 (Unix) PHP/5.3.3 mod_ssl/2.2.16 OpenSSL/1.0.0d mod_perl/2.0.4 Perl/v5.12.0
X-Powered-By: PHP/5.3.3


<?xml version="1.0" encoding="ISO-8859-1"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://purl.org/rss/1.0/"
    xmlns:cb="http://www.cbwiki.net/wiki/index.php/Specification_1.1"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:xsi="http://www.w3c.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3c.org/1999/02/22-rdf-syntax-ns#rdf.xsd">
    <channel rdf:about="http://www.bankofcanada.ca/stats/assets/rates_rss/noon/en_ALL.xml">
        <title xml:lang="en">Bank of Canada: Noon Foreign Exchange Rates</title>
        <link>http://www.bankofcanada.ca/rates/exchange/noon-rates-5-day/</link>

Okay, fine, there's valid xml in there. i know it's valid because...well just look at it. But i also know it's valid by parsing it:

var
   ...
   szXml: WideString;
   doc: DOMDocument60;
begin
   ...
   szXml := http.responseText;
  
   doc.loadXML(szXml);
   Assert(doc.parseError.errorCode = 0);

   Memo1.Lines.Add('============parsed xml');
   Memo1.Lines.Add(doc.xml);

The original IXMLHttpRequest object also exposes a responseXml property. From MSDN:

Represents the parsed response entity body.

If the response entity body is not valid XML, this property returns DOMDocument that was parsed so that you can access the error. This property does not return IXMLDOMParseError itself, but it is accessible from DOMDocument.

In my case the responseXml property exists, as it should:

Assert(http.responseXml <> nil);

And the document it returns reports no parse error:

doc := http.responseXml as DOMDocument60;
Assert(doc.parseError.errorCode = 0);

which is expected, since the xml is valid.

Except that when i look at the http.responseXml document object, it's empty:

   Memo1.Lines.Add('============responseXml');
   Memo1.Lines.Add(doc.xml);

Why is IXMLHttpRequest (and IXMLServerHttpRequest) returning an empty XML document, when:

  • there is xml
  • the xml is valid
  • there is no parse error

In long form:

uses
    msxml2_tlb;

procedure TForm1.Button1Click(Sender: TObject);
var
    szUrl: string;
    http: IXMLHTTPRequest;
    doc: DOMDocument60;
begin
    szUrl := 'http://www.bankofcanada.ca/stats/assets/rates_rss/noon/en_all.xml';

    http := CoXMLHTTP60.Create; //or CoServerXmlHttpRequest.Create
    http.open('GET', szUrl, False, '', '');
    http.send(EmptyParam);

    Assert(http.Status = 200);

    doc := http.responseXml as DOMDocument60;
    Assert(doc.parseError.errorCode = 0);

    ShowMessage('"'+doc.xml+'"');
end;

How do i make XmlHttpRequest (and more importantly ServerXMLHTTP60) behave as documented?


Solution

  • i found the problem

    i used Fiddler to save the http response to a text file. After that i could modify the response file and instruct Fiddler to serve my hand-crafted alternatives, rather than going to the original web-site.

    After 3 hours of fiddling, i managed to track down the problem in the original http response headers:

    HTTP/1.1 200 OK
    Cache-Control: max-age=5
    Connection: keep-alive
    Connection: Transfer-Encoding
    Date: Fri, 30 Mar 2012 14:50:50 GMT
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=UTF-8
    Expires: Fri, 30 Mar 2012 14:50:55 GMT
    Server: Apache/2.2.16 (Unix) PHP/5.3.3 mod_ssl/2.2.16 OpenSSL/1.0.0d mod_perl/2.0.4 Perl/v5.12.0
    X-Powered-By: PHP/5.3.3
    

    should be:

    HTTP/1.1 200 OK
    Cache-Control: max-age=5
    Connection: keep-alive
    Connection: Transfer-Encoding
    Date: Fri, 30 Mar 2012 14:50:50 GMT
    Transfer-Encoding: chunked
    Content-Type: text/xml; charset=UTF-8
    Expires: Fri, 30 Mar 2012 14:50:55 GMT
    Server: Apache/2.2.16 (Unix) PHP/5.3.3 mod_ssl/2.2.16 OpenSSL/1.0.0d mod_perl/2.0.4 Perl/v5.12.0
    X-Powered-By: PHP/5.3.3
    

    Once i found the problem, i was able to track down the documentation that explains the behavior:

    The supported MIME types for MSXML 6.0 are:

    • "text/xml"
    • "application/xml"
    • or anything that ends with "+xml", for example "application/rss+xml"
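
The three rules above can be written as a small string check. This is only a sketch: IsXmlContentType is a hypothetical helper name, not part of MSXML, and it assumes that parameters such as "; charset=UTF-8" are ignored when the type is compared, which matches the text/xml; charset=UTF-8 header that worked in the test above.

```pascal
program CheckMime;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: True when the Content-Type value is one of the
// MIME types listed above (text/xml, application/xml, or anything
// ending in +xml). Parameters after ';' are dropped before comparing;
// that part is an assumption, not documented MSXML behaviour.
function IsXmlContentType(const ContentType: string): Boolean;
var
  mime: string;
  p: Integer;
begin
  mime := LowerCase(Trim(ContentType));

  // Drop parameters such as "; charset=UTF-8"
  p := Pos(';', mime);
  if p > 0 then
    mime := Trim(Copy(mime, 1, p - 1));

  Result := (mime = 'text/xml')
    or (mime = 'application/xml')
    or ((Length(mime) > 4) and (Copy(mime, Length(mime) - 3, 4) = '+xml'));
end;

begin
  Assert(not IsXmlContentType('text/html; charset=UTF-8')); // what the server sends
  Assert(IsXmlContentType('text/xml; charset=UTF-8'));      // the hand-edited response
  Assert(IsXmlContentType('application/rdf+xml'));          // the correct type for RDF
  Writeln('ok');
end.
```

In the real program the value would come from http.getResponseHeader('Content-Type').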

    The RSS feed i'm fetching is actually a Resource Description Framework (RDF) feed, where the content type is supposed to be:

    application/rdf+xml
    

    Their use of:

    text/html
    

    is wrong on so many levels.

    So the behavior i'm experiencing is by design, although it is frustrating, as there's no easy way to know whether the responseXml is "valid":

    • the responseXml object will be assigned
    • the parseError object will be assigned
    • the parseError.ErrorCode is zero
    • the responseXml.documentElement will be nil
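
Given those four symptoms, one possible workaround is to treat a nil documentElement as the signal that MSXML skipped parsing, and to parse responseText by hand, which the question shows does work for this feed. The following is a minimal sketch, not documented MSXML behaviour; GetResponseDocument is a hypothetical helper name, and it assumes the same msxml2_tlb import used in the question.

```pascal
uses
    msxml2_tlb;

// Hypothetical helper: returns a populated document, or nil when the
// response body cannot be parsed as XML at all.
function GetResponseDocument(const http: IXMLHTTPRequest): DOMDocument60;
var
    doc: DOMDocument60;
begin
    Result := nil;

    // Normal case: the server sent an XML MIME type, so responseXml
    // already holds the parsed document.
    if http.responseXml <> nil then
    begin
        doc := http.responseXml as DOMDocument60;
        if doc.documentElement <> nil then
        begin
            Result := doc;
            Exit;
        end;
    end;

    // Fallback: responseXml is assigned but empty (e.g. Content-Type was
    // text/html), so parse the text ourselves.
    doc := CoDOMDocument60.Create;
    if doc.loadXML(http.responseText) then
        Result := doc;
end;
```

A caller would then replace doc := http.responseXml as DOMDocument60 with doc := GetResponseDocument(http) and check the result for nil before using it.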