Search code examples
sparqlsemantic-websesametriplestorelinked-data

Unable to use SPARQL SERVICE with FactForge


I am trying to access FactForge from Sesame triplestore. This is the query:

select * 

where{
SERVICE <http://factforge.net/sparql>{
?s ?p ?o
}

}
LIMIT 100

The query doesn't get executed. The same structure works with DBpedia. FactForge's SPARQL endpoint on the web is working. What do I need to do to access the endpoint successfully from Sesame?


Solution

  • What you need to do is write a more meaningful (or at least more constrained) query. Your query is just selecting all possible triples, which presumably puts a lot of stress on the factforge endpoint (which contains about 3 billion triples). The reason your query "does not get executed" (which probably means that you are just waiting forever for the query to return a result) is that it takes the SPARQL endpoint a very long time to return its response.

    The LIMIT 100 you put on the query is outside the scope of the SERVICE clause, and therefore not actually communicated to the remote endpoint you're querying. While in this particular case it would be possible for Sesame's optimizer to add that (since there are no additional constraints in your query outside the scope of the SERVICE clause), unfortunately it's currently not that smart - so the query sent to factforge is without a limit, and the actual limit is only applied after you get back the result (which, in the case of your "give me all your triples" query, naturally takes a while).

    However, demonstrably the SERVICE clause does work for FactForge when used from Sesame, because if you try a slightly more constrained query, for example a query selecting all companies:

    PREFIX dbp-ont: <http://dbpedia.org/ontology/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    select * 
    where{
      SERVICE <http://factforge.net/sparql>{
        ?s a  dbp-ont:Company
      } 
    } LIMIT 100
    

    it works fine, and you get a response.

    More generally speaking, I should recommend that if you want to do queries that are specifically to a particular SPARQL endpoint, you should use a SPARQL endpoint proxy (which is one of the repository types available in Sesame) instead of using the SERVICE clause. SERVICE is only really useful when trying to combine data from your local repository with that of a remote endpoint in a single query. Using a SPARQL endpoint proxy allows you to make sure LIMIT clauses are actually communicated to the endpoint, and just generally will give you better performance than a SERVICE query will.