indexing sparql graphdb

How to write a more efficient SPARQL query


I'm using GraphDB and the triple store is spatially indexed.

When I run this query, called Q1:

PREFIX geo-pos: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX omgeo: <http://www.ontotext.com/owlim/geo#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 
select ?a ?lat ?long 
WHERE {
    ?a omgeo:within(22.92 -142.38 75.23 183.69) . 
    ?a geo-pos:lat ?lat . 
    ?a geo-pos:long ?long .

} limit 5000

It takes less than a second; omgeo:within(22.92 -142.38 75.23 183.69) uses the spatial index of the triple store.

Also, when I use this query, called Q2:

PREFIX geo-pos: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX omgeo: <http://www.ontotext.com/owlim/geo#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 
select ?a ?lat ?long 
WHERE {
    ?a a ?o .
    filter(?o = someclass) .
    ?a geo-pos:long ?long .

} limit 5000

or this query, called Q3:

PREFIX geo-pos: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX omgeo: <http://www.ontotext.com/owlim/geo#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 
select ?a ?lat ?long 
WHERE {
    ?a a someclass .    
    ?a geo-pos:lat ?lat . 
    ?a geo-pos:long ?long .
} limit 5000

They return the same results and both take about 1 second.

But if I use this query, called Q4:

PREFIX geo-pos: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> 
PREFIX omgeo: <http://www.ontotext.com/owlim/geo#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 
select ?a ?lat ?long 
WHERE {
    ?a omgeo:within(22.92 -142.38 75.23 183.69) . 
    ?a a ?o .
    filter(?o = someclass) .
    ?a geo-pos:lat ?lat . 
    ?a geo-pos:long ?long .

} limit 5000

It takes more than 60 seconds. Do you know why this happens? Even when Q2 and Q3 return 0 results, which means that the someclass I queried has no instances, Q4 still takes more than 60 seconds. Is there a more efficient way to write Q4?


Solution

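  • If the individual queries (Q1, and Q2 or Q3) each run sufficiently quickly, and your intent is just to restrict the spatial results to one class, a query like the one you've written should do it for you (as far as I can tell). However, you could also combine the queries by making the spatial one a subquery. In principle this shouldn't make a difference, but it may steer the query planner toward evaluating the spatial part on its own first. Note that with the limit inside the subquery, it is applied before the class restriction, so the outer query can return fewer than 5000 rows. I.e., you can do something like: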

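    prefix geo-pos: <http://www.w3.org/2003/01/geo/wgs84_pos#>
    prefix omgeo: <http://www.ontotext.com/owlim/geo#>
    select ?a ?lat ?long {
      values ?o { <some-class> }   # placeholder for the class IRI
      ?a a ?o .
      # spatial lookup isolated in a subquery; its limit applies before the class check
      { select ?a ?lat ?long  {
          ?a omgeo:within(22.92 -142.38 75.23 183.69) .
          ?a geo-pos:lat ?lat .
          ?a geo-pos:long ?long .
        } limit 5000 }
    }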
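
    Another variant worth trying, sketched here under the assumption that the slowdown comes from `?a a ?o` plus the filter enumerating every type of every matching resource: state the class directly in the triple pattern, as Q3 already does, and keep the limit on the outer query. `<some-class>` is the same placeholder as above, not a real IRI, and whether this actually runs faster depends on the plan GraphDB chooses, so treat it as something to measure rather than a guaranteed fix.

    ```sparql
    PREFIX geo-pos: <http://www.w3.org/2003/01/geo/wgs84_pos#>
    PREFIX omgeo: <http://www.ontotext.com/owlim/geo#>

    SELECT ?a ?lat ?long
    WHERE {
        # Class given directly in the pattern (as in Q3), so no ?o variable
        # and no FILTER over all types is needed.
        ?a a <some-class> .   # placeholder IRI, replace with the real class
        # Bounding-box lookup served by the spatial index, unchanged from Q4.
        ?a omgeo:within(22.92 -142.38 75.23 183.69) .
        ?a geo-pos:lat ?lat .
        ?a geo-pos:long ?long .
    }
    LIMIT 5000
    ```

    Unlike the subquery version, the limit here applies after the class restriction, so it returns up to 5000 instances of the class inside the bounding box, which matches what Q4 asks for.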