I have some data in a Sesame triplestore. When I query it using the GUI, the sequence of triples returned remains the same no matter how many times I run the query. When I run the same query programmatically, the sequence keeps changing (although the results are the same). Can someone please explain why this is the case and what I can do to ensure that the results are returned in the same order?
This is my code:
from rdflib import Graph
from SPARQLWrapper import SPARQLWrapper, JSON

sesameSparqlEndpoint = 'http://my.ip.ad.here:8080/openrdf-sesame/repositories/rep_name'
sparql = SPARQLWrapper(sesameSparqlEndpoint)
queryStringDownload = 'SELECT * WHERE {?s ?p ?o} LIMIT 10 OFFSET 1000'
dataGraph = Graph()
sparql.setQuery(queryStringDownload)
sparql.method = 'GET'
# Ask the endpoint for SPARQL JSON results and parse them into a dict
sparql.setReturnFormat(JSON)
output = sparql.query().convert()
print(output)
The order in which a SPARQL query returns its results is undefined unless the query itself specifies one, and any SPARQL engine is completely free to return results in any order it sees fit. Depending on the database implementation and the techniques it uses for query optimisation, serialization, indexing, compression, etc., the result for the exact same query can be in a different order each time you execute it.
The above is true for any SPARQL engine, by the way, not just Sesame. Even if you find a database that seems to return the results in the same order every time, this is not behaviour that you should rely on, since it will not be guaranteed behaviour and whenever that database releases a new version, it may suddenly change.
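If you want to check that two runs really did return the same results and differ only in order, you can compare them in an order-insensitive way. The following is a minimal sketch, assuming the result has the standard SPARQL JSON layout (`head.vars` and `results.bindings`) that `convert()` produces for the JSON return format. The helper name `canonical_rows` is made up for this illustration, and it only looks at each term's type and value, ignoring datatypes and language tags.

```python
def canonical_rows(results):
    """Return the rows of a SPARQL JSON result as a sorted list of tuples.

    Hypothetical helper: it normalises a result so that two runs of the
    same query can be compared regardless of the order they arrived in.
    """
    # Sort the variable names too, so column order does not matter
    variables = sorted(results["head"]["vars"])
    rows = []
    for binding in results["results"]["bindings"]:
        # A variable can be unbound in a row; use an empty placeholder then
        row = tuple(
            (binding[v]["type"], binding[v]["value"]) if v in binding else ("", "")
            for v in variables
        )
        rows.append(row)
    return sorted(rows)
```

Two results are then equal up to ordering when `canonical_rows(first) == canonical_rows(second)`. Note that without an explicit ordering in the query, a `LIMIT`/`OFFSET` window is not guaranteed to contain the same rows on every run, so this comparison can legitimately fail.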
However, SPARQL has a built-in clause to control the order in which results are returned: ORDER BY. If you wish to execute a query and be certain that the results are returned in a certain fixed order, you need to use this.
TL;DR: adapt your SPARQL query, like this:
SELECT * WHERE {?s ?p ?o} ORDER BY ?s LIMIT 10 OFFSET 1000
NB: this particular query is potentially very expensive. You are asking for all triples in the database, and even though you limit the eventual result to 10, the engine may still need to range over a large part of the complete database to order the result properly.
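Applied to the Python code from the question, this could look roughly like the sketch below. The function names `build_page_query` and `fetch_page` are hypothetical, and sorting on all three variables rather than only `?s` is my own assumption: with `ORDER BY ?s` alone, triples that share a subject can still come back in an engine-defined order, so sorting on `?p` and `?o` as well narrows that down further.

```python
def build_page_query(limit, offset):
    """Build a paged 'all triples' query with an explicit ordering."""
    # Sorting on ?s, then ?p, then ?o also orders triples that share a subject
    return (
        "SELECT * WHERE {{ ?s ?p ?o }} "
        "ORDER BY ?s ?p ?o "
        "LIMIT {0} OFFSET {1}".format(int(limit), int(offset))
    )


def fetch_page(endpoint, limit=10, offset=1000):
    """Run the ordered query against a SPARQL endpoint and return its rows."""
    # Imported here so build_page_query can be used without the dependency
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(build_page_query(limit, offset))
    sparql.setReturnFormat(JSON)
    # With the JSON return format, convert() yields a dict in SPARQL JSON layout
    return sparql.query().convert()["results"]["bindings"]
```

Calling `fetch_page(sesameSparqlEndpoint)` repeatedly should then give the rows in the same sequence, at the cost of the sorting work described above.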