python rdf sparql rdflib redland

SPARQL query on a remote endpoint with RDFLib / Redland


I'm trying to query remote endpoints and get owl:sameAs mappings. I've tried both RDFLib and Redland, but neither worked for me; I'm probably not dealing with namespaces correctly.

Here is my attempt in RDFLib:

    import rdflib

    rdflib.plugin.register('sparql', rdflib.query.Processor, 'rdfextras.sparql.processor', 'Processor')
    rdflib.plugin.register('sparql', rdflib.query.Result, 'rdfextras.sparql.query', 'SPARQLQueryResult')

    g = rdflib.Graph()

    query = """
        SELECT *
        FROM <http://api.talis.com/stores/bbc-backstage/services/sparql>
        WHERE {
             ?s a http://purl.org/ontology/mo/MusicArtist;
                http://www.w3.org/2002/07/owl#sameAs ?o .
        }Limit 50
    """

    for row in g.query(query):
        print row

And here is Redland:

    import RDF

    model = RDF.Model()

    query = """
        SELECT *
        FROM <http://api.talis.com/stores/bbc-backstage/services/sparql>
        WHERE {
             ?s a http://purl.org/ontology/mo/MusicArtist;
                http://www.w3.org/2002/07/owl#sameAs ?o .
        }Limit 50
    """

    for statement in RDF.Query(query, query_language="sparql").execute(model):
        print statement

Can you please give me a hint about what is wrong in either of these? I have one more difficulty: is it possible to get the dataset name of the object? For example, given:

    ?s        = http://www.bbc.co.uk/music/artists/eb5c8564-927d-414d-b152-c7b48a2c9d8b#artist
    predicate = http://www.w3.org/2002/07/owl#sameAs
    ?o        = http://dbpedia.org/resource/The_Boy_Least_Likely_To

Can I get the name "DBpedia" in this example, or the name of any other dataset that I have a sameAs link to? (Or perhaps I could just look up the dataset names I'm interested in within the object string.) Thank you very much in advance.


Solution

  • Various things:

    You are right, you need to enclose any URI within < >. The correct query is:

    SELECT ?s ?o
    WHERE {
        ?s a <http://purl.org/ontology/mo/MusicArtist> ;
           <http://www.w3.org/2002/07/owl#sameAs> ?o .
    } LIMIT 50
    

    ... see the results here.

    FROM does not work in rdflib or Redland the way you expect. It does not query a remote SPARQL endpoint; it fetches a remote graph, or a graph with that name in a local store. What you want in your case is SERVICE; see how it works here with Jena. Unfortunately, neither rdflib nor Redland implements the SERVICE clause for SPARQL, but there are workarounds.
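For reference, here is a rough sketch of what a SERVICE-based query looks like, written as a Python string builder so it matches the rest of the code in this thread. `federated_query` is a hypothetical helper made up for illustration; it is not part of rdflib, Redland, or Jena. It only builds the query text, which you would then hand to an engine that supports SPARQL 1.1 federated queries.

```python
# Hypothetical helper (an assumption for illustration, not a library API):
# wraps a graph pattern in a SPARQL 1.1 SERVICE clause, so that an engine
# with federation support forwards the pattern to the remote endpoint.
def federated_query(endpoint, pattern, limit=50):
    return (
        "SELECT ?s ?o\n"
        "WHERE {\n"
        "  SERVICE <%s> {\n"
        "%s\n"
        "  }\n"
        "}\n"
        "LIMIT %d" % (endpoint, pattern, limit)
    )


# Endpoint and graph pattern taken from the question.
ENDPOINT = "http://api.talis.com/stores/bbc-backstage/services/sparql"
PATTERN = (
    "    ?s a <http://purl.org/ontology/mo/MusicArtist> ;\n"
    "       <http://www.w3.org/2002/07/owl#sameAs> ?o ."
)

if __name__ == "__main__":
    print(federated_query(ENDPOINT, PATTERN))
```

Note that the endpoint appears inside the query here, in the SERVICE clause, rather than in FROM.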

    One possible solution is to use SPARQLWrapper for Python. It is straightforward; here is your code rewritten with that library:

    from SPARQLWrapper import SPARQLWrapper, JSON

    # The endpoint is passed to the wrapper, not written into the query.
    sparql = SPARQLWrapper("http://api.talis.com/stores/bbc-backstage/services/sparql")
    sparql.setQuery("""
        SELECT ?s ?o
        WHERE {
             ?s a <http://purl.org/ontology/mo/MusicArtist>;
                <http://www.w3.org/2002/07/owl#sameAs> ?o .
        } limit 50
    """)
    # Ask for SPARQL JSON results and convert them into a Python dict.
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()

    # Each binding maps a variable name to a dict holding its 'value'.
    for result in results["results"]["bindings"]:
        print("%s %s" % (result["s"]["value"], result["o"]["value"]))
    

    As you can see, the remote SPARQL endpoint becomes a parameter outside the query.
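
On the second part of the question: the query results only give you the object URI, so one pragmatic option is the one you suggest yourself, namely deriving the dataset from the URI string. Below is a minimal sketch, assuming Python 3 and assuming that the host name of the URI is a good enough stand-in for the dataset name (that is a heuristic, not something the data guarantees). `dataset_host` and `count_by_dataset` are hypothetical helper names, and the sample URIs are the ones from the question.

```python
from collections import Counter
from urllib.parse import urlparse


def dataset_host(uri):
    """Return the lower-cased host of a URI, used here as a rough dataset label."""
    return urlparse(uri).netloc.lower()


def count_by_dataset(uris):
    """Count how many of the given URIs fall under each host."""
    return Counter(dataset_host(uri) for uri in uris)


# Sample values copied from the question; in practice these would be the
# result["o"]["value"] strings coming back from the endpoint.
sample_objects = [
    "http://dbpedia.org/resource/The_Boy_Least_Likely_To",
    "http://www.bbc.co.uk/music/artists/eb5c8564-927d-414d-b152-c7b48a2c9d8b#artist",
]

if __name__ == "__main__":
    for uri in sample_objects:
        print("%s -> %s" % (uri, dataset_host(uri)))
```

If you want a friendlier label such as "DBpedia", you would still need your own mapping from host names to dataset names.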