Tags: python, python-3.x, sparql, virtuoso, sparqlwrapper

Python SPARQLWrapper returns only 10000 results


I use the SPARQLWrapper module to run a query against a Virtuoso endpoint and get the result.

The query always returns a maximum of 10000 results.

Here is the Python script:

from SPARQLWrapper import SPARQLWrapper, JSON 

queryString = """ 
SELECT DISTINCT ?s
WHERE {
    ?s ?p ?o .
}
"""


sparql = SPARQLWrapper("http://localhost:8890/sparql")
sparql.setQuery(queryString)
sparql.setReturnFormat(JSON)

res = sparql.query().convert()

# Parse result
parsed = []
for entry in res['results']['bindings']:
    for sparql_variable in entry.keys():
        parsed.append({sparql_variable: entry[sparql_variable]['value']})

print('Query returned ' + str(len(parsed)) + ' results')

When I launch the query with

SELECT count(*) AS ?count

I get the right number of triples: 917051.

Why does the SPARQLWrapper module limit the number of results to 10000?

How do I get all the results?


Solution

  • The answer is to adjust the Virtuoso configuration file (virtuoso.ini), as documented. Specifically, for this case you need to increase the ResultSetMaxRows setting in the [SPARQL] stanza.

    The limit is not in SPARQLWrapper. You would see the same 10000-row limit if you ran the full SELECT (rather than the COUNT, which returns only one row) through the SPARQL endpoint, Conductor, or any other interface.
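
To illustrate the setting the answer refers to, here is a minimal sketch that reads ResultSetMaxRows out of a [SPARQL] stanza using Python's standard configparser module. The sample text, the value 1000000, and the helper name read_result_set_max_rows are assumptions made for illustration; they are not taken from a real virtuoso.ini, and a real file has many other sections and keys.

```python
import configparser

# Hypothetical fragment shaped like the [SPARQL] stanza of virtuoso.ini.
# 1000000 is an illustrative value, picked only because it exceeds the
# 917051 rows mentioned in the question.
SAMPLE_INI = """
[SPARQL]
ResultSetMaxRows = 1000000
"""


def read_result_set_max_rows(ini_text):
    """Return ResultSetMaxRows from the [SPARQL] stanza as an int, or None if absent."""
    # interpolation=None: treat '%' in values literally.
    # strict=False: tolerate repeated keys instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(ini_text)
    if not parser.has_option("SPARQL", "ResultSetMaxRows"):
        return None
    return parser.getint("SPARQL", "ResultSetMaxRows")


limit = read_result_set_max_rows(SAMPLE_INI)
print("ResultSetMaxRows:", limit)
```

To check a real installation, the same helper could be given the contents of the server's own virtuoso.ini. A changed value will most likely only take effect after the Virtuoso server is restarted.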