
Sesame OpenRDF: XML parsing error


I am trying to use Sesame (OpenRDF) to download postcode and LSOA linked data. I created a new repository of type "SPARQL endpoint proxy" and added

http://opendatacommunities.org/sparql

as the endpoint. I then ran this query:

 PREFIX pc: <http://data.ordnancesurvey.co.uk/ontology/postcode/>
 PREFIX geo: <http://opendatacommunities.org/def/geography#>
 SELECT * WHERE {
   ?postcodeUnit a pc:PostcodeUnit ;
                 geo:lsoa ?lsoa .
 }
 LIMIT 10

This returns the following error:

XML Parsing ERROR: no element found, Line Number 1, Column 1:

I can get the query working in R, but I need to use a web service to download all the data, because R times out if the query has no LIMIT.

So I am trying to run the above query through the endpoint, but I get the error. I also set up and ran a query using the http://dbpedia.org/sparql endpoint, which works fine. Has anyone seen an error similar to this one before?


Solution

  • The cause of this seems to be a bug in the SPARQL endpoint at opendatacommunities.org.

    When it sends a response to the query, the body is in JSON format, but the HTTP response header says the response is in XML format. The behavior is erratic: I tried a few test requests, and sometimes it sends the correct response header and sometimes it doesn't.

    Sesame looks at the Content-Type response header to determine which parser to use to process the result. When the header is wrong, it picks the wrong parser and you get a parsing error.

    Given the erratic behavior, I'm not really sure there's anything that can be done at the client's end. I'm afraid the only real solution is to get in touch with the endpoint's maintainers and ask them to fix the bug at their end.

    That said, what you can do at the client's end (at least programmatically) is override the default request headers that Sesame sends to the SPARQL endpoint, so that the endpoint sends back a correctly labeled response. This will require quite a bit of trial and error, as I haven't yet figured out what the endpoint expects.
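
To make the trial and error a little more concrete, here is a minimal sketch of the two checks described above: building a request with a single explicit Accept header, and comparing the declared Content-Type with what the body actually looks like. It deliberately does not use Sesame. It assumes Java 11+ and the JDK's own `java.net.http` classes, and the class and method names (`SparqlResponseCheck`, `buildQuery`, `sniffFormat`, `isMismatched`) are hypothetical, chosen only for illustration. The first-character sniffing is a rough heuristic, not a real parser.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of Sesame: illustrates the header/body mismatch check.
final class SparqlResponseCheck {

    // Standard media types for SPARQL SELECT results.
    static final String JSON = "application/sparql-results+json";
    static final String XML = "application/sparql-results+xml";

    /** Builds a SPARQL GET request that asks for exactly one result format. */
    static HttpRequest buildQuery(String endpoint, String sparql, String accept) {
        String url = endpoint + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", accept) // override whatever default the client would send
                .GET()
                .build();
    }

    /** Guesses the body format from its first non-whitespace character. */
    static String sniffFormat(String body) {
        String trimmed = body.strip();
        if (trimmed.isEmpty()) {
            return "empty";
        }
        char first = trimmed.charAt(0);
        if (first == '{') {
            return "json";
        }
        if (first == '<') {
            return "xml";
        }
        return "unknown";
    }

    /** True when the Content-Type header names one format and the body looks like the other. */
    static boolean isMismatched(String contentType, String body) {
        String declared = contentType.toLowerCase(Locale.ROOT);
        String actual = sniffFormat(body);
        boolean declaredXmlButJson = declared.contains("xml") && actual.equals("json");
        boolean declaredJsonButXml = declared.contains("json") && actual.equals("xml");
        return declaredXmlButJson || declaredJsonButXml;
    }

    public static void main(String[] args) {
        // Offline demo: nothing is sent over the network here.
        HttpRequest request = buildQuery(
                "http://opendatacommunities.org/sparql",
                "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
                XML);
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse("(none)"));
        System.out.println("Mismatch: "
                + isMismatched("application/sparql-results+xml; charset=utf-8", "{\"head\":{}}"));
    }
}
```

To try this against the live endpoint, you would send the built request with `java.net.http.HttpClient`, read the `Content-Type` response header and the body, and pass both to `isMismatched`. Repeating that with each Accept value should show which request headers, if any, make the endpoint label its response correctly.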