
How to select random DBpedia nodes with SPARQL?


How can I select a random sample from DBpedia using the SPARQL endpoint?

This query

SELECT ?s WHERE { ?s ?p ?o . FILTER ( 1 > bif:rnd (10, ?s, ?p, ?o) ) } LIMIT 10

(found here) seems to work fine on most SPARQL endpoints, but on http://dbpedia.org/sparql the result gets cached, so it always returns the same 10 nodes.

If I try it from Jena, I get the following exception:

Unresolved prefixed name: bif:rnd

And I can't find out what the 'bif' namespace is.

Any idea on how to solve this?

Mulone


Solution

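  • bif:rnd is not part of the SPARQL standard (the bif: prefix refers to Virtuoso's built-in functions), so it is not portable across SPARQL endpoints, and other engines such as Jena cannot resolve it. You can use LIMIT, ORDER BY and OFFSET to simulate a random sample with a standard query. Something like ...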

    SELECT * WHERE { ?s ?p ?o } 
    ORDER BY ?s OFFSET $some_random_number$ LIMIT 10
    

    Here some_random_number is a number generated by your application. This should avoid the caching problem, but the query is still quite expensive, and I don't know whether public endpoints will support it.

    Try to avoid completely open patterns like ?s ?p ?o and your query will be much more efficient.
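
As a rough illustration of the "number generated by your application" step, here is a minimal Python sketch that draws a random offset and fills it into the query above. The function name build_random_sample_query and the max_offset upper bound are assumptions made for this example; the application has to pick max_offset itself (for instance from a known or estimated result count), and nothing here sends the query to an endpoint.

```python
import random

# Same query shape as in the answer above; {pattern}, {offset} and {limit}
# are filled in by the application.
QUERY_TEMPLATE = """SELECT * WHERE {{ {pattern} }}
ORDER BY ?s OFFSET {offset} LIMIT {limit}"""


def build_random_sample_query(max_offset, limit=10, pattern="?s ?p ?o", rng=None):
    """Return (offset, query), with offset drawn uniformly from [0, max_offset].

    max_offset is a hypothetical upper bound supplied by the caller; it should
    not exceed the number of results the pattern can produce.
    """
    if max_offset < 0:
        raise ValueError("max_offset must be non-negative")
    rng = rng or random.Random()
    offset = rng.randint(0, max_offset)  # randint includes both bounds
    query = QUERY_TEMPLATE.format(pattern=pattern, offset=offset, limit=limit)
    return offset, query


if __name__ == "__main__":
    # 100000 is an arbitrary illustrative bound, not a DBpedia statistic.
    offset, query = build_random_sample_query(max_offset=100_000)
    print(query)
```

Because the offset changes between calls, each query string differs, which is what should sidestep the endpoint's cache. Passing a narrower pattern argument instead of the default ?s ?p ?o follows the advice above about avoiding completely open patterns.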