I am creating a class hierarchy based on RDFS/OWL, and creating instances in all the classes using the a (rdf:type) relationship. I want to retrieve the instances of a specific class, not including the instances of its children. However, when I write a SPARQL query, it gives me all the instances of every child class as well.
My ontology says: Book is a class, which has two subclasses: hard_bounded_book and soft_binded_books. In other words (with some instances):
@prefix ex: <http://book_triples.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
ex:hard_bounded_book1
a ex:hard_bounded_book .
ex:soft_binded_books1
a ex:soft_binded_books .
ex:soft_binded_books rdfs:subClassOf ex:Book .
ex:hard_bounded_book rdfs:subClassOf ex:Book .
ex:Book a rdfs:Class .
ex:Book1 a ex:Book .
When I run this query:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://book_triples.org/>
SELECT ?book
WHERE
{ ?book rdf:type ex:Book }
It returns all three (Book1, hard_bounded_book1, soft_binded_books1), but I would like to get only the first result (Book1).
Any help is appreciated. Thanks.
Regardless of whether inferencing is turned on or off in your store, you can write a query that only returns the instances of the specific class, by filtering out all the instances that are also instances of a subclass of ex:Book, like so:
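PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://book_triples.org/>
SELECT ?book
WHERE {
  ?book rdf:type ex:Book .
  FILTER NOT EXISTS {
    ?book rdf:type ?c .
    ?c rdfs:subClassOf+ ex:Book .
    FILTER (?c != ex:Book)
  }
}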
For every book returned, it checks that no triple exists making that book an instance of a subclass of ex:Book. The inner filter (checking that ?c is not equal to ex:Book) is necessary because in RDFS, every class is a subclass of itself: with inferencing enabled, ?c would otherwise match ex:Book itself and every book would be filtered out.
Of course, this query is more expensive to run than the simple original you had, so if your triplestore has an option to (temporarily) turn off inferencing, that might be a preferable solution.
As an aside: the + sign after the subClassOf pattern is a "1 or more levels deep" operator, and is optional here. You need to include it if you wish to be rigorous about excluding all possible instances of subclasses, even when no reasoner has computed all the inferences. Given that in your scenario it's likely there's a reasoner that infers the complete deductive closure, you can probably leave it out.
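For comparison, the variant without the + might look like the sketch below. It assumes the store's reasoner has already materialized the full subclass closure, so that every indirect subclass of ex:Book also appears as a direct rdfs:subClassOf triple; that assumption is not guaranteed for every store or configuration.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://book_triples.org/>

# Assumption: inferencing is on and the subclass closure is complete,
# so a single subClassOf step reaches every subclass of ex:Book.
SELECT ?book
WHERE {
  ?book rdf:type ex:Book .
  FILTER NOT EXISTS {
    ?book rdf:type ?c .
    ?c rdfs:subClassOf ex:Book .   # no + here: direct triples only
    FILTER (?c != ex:Book)         # a class is a subclass of itself
  }
}
```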
Update: To explain my point about the + sign in a bit more detail: imagine that we have classes A, B, and C, where B is a subclass of A, and C is a subclass of B.
Imagine an individual x, which is declared to be an instance of C. In that case, the query without the + sign will work to remove x from the result. So far so good. However, imagine that we also insert the explicit fact that x is an instance of A.
If inferencing is enabled, we're still fine with the query without a + operator. However, without inferencing, the query for all instances of A will now return x, even though x is also an instance of an (indirect) subclass of A (namely, C). That is the edge case for which the + operator is helpful.
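The edge case can be reproduced with a small data set. The sketch below uses a hypothetical namespace (http://example.org/) and the names A, B, C and x from the description above, and it assumes a store with inferencing turned off.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/>   # hypothetical namespace

INSERT DATA {
  ex:B rdfs:subClassOf ex:A .
  ex:C rdfs:subClassOf ex:B .
  ex:x a ex:C .
  ex:x a ex:A .   # the extra explicit fact
}
```

Querying that data for the instances of A only:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/>   # hypothetical namespace

SELECT ?x
WHERE {
  ?x rdf:type ex:A .
  FILTER NOT EXISTS {
    ?x rdf:type ?c .
    ?c rdfs:subClassOf+ ex:A .   # follows C -> B -> A
    FILTER (?c != ex:A)
  }
}
```

With the +, the path from C to A is found through B, so x is filtered out. Without the +, there is no direct triple stating that C is a subclass of A, so on a store without inferencing x would be returned.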