
OFFSET in sparql


I have a query that counts the number of records; it returns 129980:

SELECT count distinct ?url
  WHERE {
  ?url a dbo:Film.
  } 

Because each query returns only 10000 records, I have to use "offset":

SELECT distinct ?url
  WHERE {
  ?url a dbo:Film.
  } LIMIT 10000 OFFSET 1000

Question: to retrieve all the records, I thought I needed to set offset = 12. But why, when I set offset = 1000, do I still get 1000 records? Thank you very much for your response; I appreciate your help.


Solution

  • Note that your first query uses invalid SPARQL syntax. You only get a result because the engine you're querying (if you're querying DBpedia, as it appears, that's Virtuoso) is very forgiving of many errors. Correct and complete syntax would be --

    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ( COUNT ( DISTINCT ?url ) AS ?HowManyFilms )
      WHERE {
      ?url a dbo:Film .
      } 
    

    Things to know, for your second query --

    1. OFFSET means "skip this many rows from the total result set"
    2. LIMIT means "only give me this many rows (starting after any OFFSET)"
    3. Rows may be delivered in any order, and this ordering may change from query-to-query, if you don't include an ORDER BY. This can mean that multiple queries with different OFFSET may not get you all rows, and may deliver duplicate rows, when all the partial result sets are combined. So -- anytime you're using OFFSET and/or LIMIT, it's best practice to also use an ORDER BY.
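
    To make points 1 and 2 concrete, here is a minimal Python sketch that models OFFSET and LIMIT as a slice over an already ordered list. It is only a toy model, not a SPARQL engine: the `page` helper and the `film…` identifiers are made up for illustration, and only the row count comes from your question. It shows that OFFSET counts rows rather than pages, so OFFSET 1000 merely shifts the window by 1,000 rows and overlaps heavily with the first page.

```python
# Toy model of LIMIT/OFFSET over an ordered result set (not a real SPARQL engine).
def page(rows, limit, offset=0):
    """Skip `offset` rows, then return at most `limit` rows."""
    return rows[offset:offset + limit]

# Hypothetical stand-in for the 129,980 ordered film URLs from the question.
rows = [f"film{i:06d}" for i in range(129_980)]

first = page(rows, limit=10_000, offset=0)        # rows 0..9999
shifted = page(rows, limit=10_000, offset=1_000)  # rows 1000..10999, overlaps `first`
last = page(rows, limit=10_000, offset=120_000)   # only 9,980 rows remain

# Number of rows that appear in both `first` and `shifted`.
overlap = len(set(first) & set(shifted))
print(len(first), len(shifted), len(last), overlap)
```

    Under these assumptions, the slice with OFFSET 1000 repeats 9,000 rows already returned by the first page, which is why the offset should advance by the full page size each time.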

    All together, add this to your second query (the one that selects ?url, not the count) to get the first 10,000 rows --

    ORDER BY ?url LIMIT 10000 OFFSET 0
    

    -- and this to get the last 9,980 rows --

    ORDER BY ?url LIMIT 10000 OFFSET 120000
    

    I leave the intermediary queries for you...
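
    If you would rather generate the intermediary queries than type them, the offsets follow from simple arithmetic: 129,980 rows at 10,000 per page need 13 queries, with OFFSET 0, 10000, ..., 120000. The Python sketch below builds the query text only and does not contact any endpoint. The helper names `page_offsets` and `page_query` are hypothetical, and the total and page size are taken from your question.

```python
import math

TOTAL_ROWS = 129_980   # count reported in the question
PAGE_SIZE = 10_000     # per-query row cap described in the question

# Query text assembled from the fragments above; doubled braces are literal braces.
QUERY_TEMPLATE = """PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?url
WHERE {{
  ?url a dbo:Film .
}}
ORDER BY ?url LIMIT {limit} OFFSET {offset}"""

def page_offsets(total, page_size):
    """Return the OFFSET value for each page needed to cover `total` rows."""
    return [i * page_size for i in range(math.ceil(total / page_size))]

def page_query(offset, limit=PAGE_SIZE):
    """Build the query text for one page (hypothetical helper)."""
    return QUERY_TEMPLATE.format(limit=limit, offset=offset)

offsets = page_offsets(TOTAL_ROWS, PAGE_SIZE)
print(len(offsets), offsets[0], offsets[-1])
print(page_query(offsets[1]).splitlines()[-1])
```

    Each generated query keeps the same ORDER BY, so the pages should not overlap or skip rows as long as the underlying data does not change between requests.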