Is there any difference between the two queries below?
select distinct ?i
where {
  ?i rdf:type <http://foo/bar#A> .
  FILTER EXISTS {
    ?i <http://foo/bar#hasB> ?b .
    ?b rdf:type <http://foo/bar#B1> .
  }
}

select distinct ?i
where {
  FILTER EXISTS {
    ?i <http://foo/bar#hasB> ?b .
    ?b rdf:type <http://foo/bar#B1> .
  }
  ?i rdf:type <http://foo/bar#A> .
}
Are there differences in performance or results?
First, you do not need FILTER EXISTS. You can rewrite your query with a basic graph pattern (a set of regular triple patterns). But let's suppose you are using FILTER NOT EXISTS or something similar.
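As a rough sketch of that rewrite, the query from the question could be expressed with plain triple patterns as below. The rdf: prefix declaration is added here so the query is self-contained; the IRIs are the ones from the question. Because only ?i is projected and DISTINCT is applied, an ?i with several matching ?b values should still appear once.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Same intent as the FILTER EXISTS version, written as one basic graph pattern.
SELECT DISTINCT ?i
WHERE {
  ?i rdf:type <http://foo/bar#A> .
  ?i <http://foo/bar#hasB> ?b .
  ?b rdf:type <http://foo/bar#B1> .
}
```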
In general, order matters. However, top-down evaluation semantics plays a role mostly in the case of OPTIONAL, and that is not your case. Thus, results should be the same.
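To illustrate where order does matter, here is a sketch of two OPTIONAL queries over the same IRIs as in the question (these queries are illustrative and not part of the original question). In the first, every instance of A is kept and ?b is bound only when available. In the second, the OPTIONAL is left-joined against the empty group first, so, assuming the data contains at least one hasB triple, instances of A without a hasB value are dropped by the subsequent join.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Variant 1: all instances of A, with ?b bound when a hasB triple exists.
SELECT ?i ?b
WHERE {
  ?i rdf:type <http://foo/bar#A> .
  OPTIONAL { ?i <http://foo/bar#hasB> ?b . }
}
```

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Variant 2: the OPTIONAL comes first, so it extends the empty solution;
# the join with the type pattern then keeps only instances that have hasB
# (assuming the dataset contains at least one hasB triple).
SELECT ?i ?b
WHERE {
  OPTIONAL { ?i <http://foo/bar#hasB> ?b . }
  ?i rdf:type <http://foo/bar#A> .
}
```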
Top-down evaluation semantics can be overridden by bottom-up evaluation semantics. Fortunately, bottom-up semantics doesn't prescribe evaluating FILTER logically first, though that is possible in the case of FILTER EXISTS and FILTER NOT EXISTS.
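For the FILTER NOT EXISTS case assumed above, a sketch could look like the following. Unlike the EXISTS form, it has no plain basic-graph-pattern equivalent. The query itself is an illustration built from the IRIs in the question. Since the SPARQL 1.1 translation to algebra collects the FILTER elements of a group and applies them to the group as a whole, moving the FILTER before or after the type pattern should not change the result.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Instances of A that have no hasB value of type B1.
SELECT DISTINCT ?i
WHERE {
  ?i rdf:type <http://foo/bar#A> .
  FILTER NOT EXISTS {
    ?i <http://foo/bar#hasB> ?b .
    ?b rdf:type <http://foo/bar#B1> .
  }
}
```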
The SPARQL algebra representation is the same for both queries:
(prefix ((rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>)
         (foobar: <http://foo/bar#>))
  (distinct
    (project (?i)
      (filter (exists
                (bgp
                  (triple ?i foobar:hasB ?b)
                  (triple ?b rdf:type foobar:B1)
                ))
        (bgp (triple ?i rdf:type foobar:A))))))
Naively following top-down semantics, an engine should evaluate ?i a foobar:A first, whereas the subpattern is much more selective. Fortunately, optimizers try to reorder patterns depending on their selectivity. However, their predictions can be erroneous.
By the way, the rdf:type predicate is said to be a performance killer in Virtuoso.
Results can differ if an endpoint has a query execution time limit and flushes partial results when the timeout is reached: an example.