
SPARQL using live.dbpedia and getting XML Schema in the result


I'm experimenting with SPARQL in a Java application. I've found a few connection basics to get up and running, but I fear I'm making a mistake that will turn into something worse later.

Every connection suggestion, regardless of the library used, says to connect to "http://dbpedia.org/sparql/", yet this doesn't work for me.

I checked the URL that is returned when I run a query using the online editor and noticed the "live" prefix, so I used that in my connection string, and it works. That is to say, my connection string looks like "http://live.dbpedia.org/sparql".

It does return the result. However, the result has the XML Schema attached, which makes me wonder whether that is because of the "live" I've added in.

Below is the simple connection code I'm using. Is this correct? Any and all help is greatly appreciated, thank you.

If the "live" is correct, is it possible to extract just the value without the Schema?

    // Jena 3.x+ package; older releases use com.hp.hpl.jena.query instead.
    import org.apache.jena.query.*;

    // Build the SPARQL query text.
    StringBuilder sb = new StringBuilder();
    sb.append("PREFIX dbr: <http://dbpedia.org/resource/> \n");
    sb.append("PREFIX dbp: <http://dbpedia.org/property/> \n");
    sb.append("PREFIX dbo: <http://dbpedia.org/ontology/> \n");
    sb.append("SELECT ?dob \n");
    sb.append("WHERE {dbr:Tony_Blair dbp:birthDate ?dob} \n");

    Query query = QueryFactory.create(sb.toString());
    QueryExecution qexec = QueryExecutionFactory.sparqlService("http://live.dbpedia.org/sparql", query);

    try {
        ResultSet results = qexec.execSelect();
        while (results.hasNext()) {
            QuerySolution soln = results.nextSolution();
            // Prints the full literal, including its datatype.
            System.out.println(soln.get("dob"));
        }
    } finally {
        // Release the connection held by the query execution.
        qexec.close();
    }

The result is:

1953-05-06^^http://www.w3.org/2001/XMLSchema#date

Solution

  • Well, the result as you show it is missing some brackets and quotes, but I assume that is caused by how you copy-pasted it. Usually it would look like this:

    "1953-05-06"^^<http://www.w3.org/2001/XMLSchema#date>
    

    But in essence your query and code are correct. The "attached XML Schema" here is the datatype of the returned literal.

    An RDF literal consists of a lexical value (in your case "1953-05-06") and a datatype (in your case http://www.w3.org/2001/XMLSchema#date). It can also optionally have a language tag, e.g. "colour"@en-GB.

    If you wish to remove the datatype from the result and only retrieve the lexical value, you can use the STR() function as part of the SELECT clause in your query:

    SELECT (STR(?dob) as ?date_of_birth)
    

    As for the connection string that you are struggling with: there are two separate DBpedia endpoints. The "regular" one is http://dbpedia.org/sparql (no trailing slash) - this queries a static dataset that is synced with Wikipedia changes every 6 months or so. The "live" endpoint, http://live.dbpedia.org/sparql, is an effort to have a more up-to-date dataset ready for querying. See https://wiki.dbpedia.org/online-access/DBpediaLive for more details.
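
To make the literal structure described above a bit more concrete (lexical value, datatype, optional language tag), here is a minimal sketch in plain Java that splits the printed form of a literal into those parts. The class LiteralParts and its parse method are hypothetical names made up for illustration; they are not part of Jena or any other library, and the sketch assumes simple input without escaped quotes.

```java
// Hypothetical helper: splits the textual form of an RDF literal into its parts.
// Not part of Jena or any other library; escape sequences are not handled.
final class LiteralParts {
    final String lexical;   // the value itself, e.g. 1953-05-06
    final String datatype;  // datatype IRI, or null if absent
    final String language;  // language tag, or null if absent

    LiteralParts(String lexical, String datatype, String language) {
        this.lexical = lexical;
        this.datatype = datatype;
        this.language = language;
    }

    static LiteralParts parse(String text) {
        String rest = text.trim();
        String datatype = null;
        String language = null;

        int marker = rest.lastIndexOf("^^");
        if (marker >= 0) {
            // Typed literal: everything after ^^ is the datatype IRI.
            datatype = strip(rest.substring(marker + 2), '<', '>');
            rest = rest.substring(0, marker);
        } else {
            // Language-tagged literal: @tag directly follows the closing quote.
            int quote = rest.lastIndexOf('"');
            if (quote > 0 && quote + 1 < rest.length() && rest.charAt(quote + 1) == '@') {
                language = rest.substring(quote + 2);
                rest = rest.substring(0, quote + 1);
            }
        }
        return new LiteralParts(strip(rest, '"', '"'), datatype, language);
    }

    // Removes one leading/trailing delimiter pair if both are present.
    private static String strip(String s, char open, char close) {
        boolean wrapped = s.length() >= 2
                && s.charAt(0) == open
                && s.charAt(s.length() - 1) == close;
        return wrapped ? s.substring(1, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        LiteralParts dob = parse("\"1953-05-06\"^^<http://www.w3.org/2001/XMLSchema#date>");
        System.out.println(dob.lexical);
        System.out.println(dob.datatype);

        LiteralParts word = parse("\"colour\"@en-GB");
        System.out.println(word.lexical + " / " + word.language);
    }
}
```

In practice you should not need string handling like this with Jena: besides using STR() in the query, calling soln.getLiteral("dob").getLexicalForm() on the QuerySolution should give you just the lexical value on the client side.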