semantic-web ontology sesame triplestore

SPARQL UPDATE validation


I have a Sesame triplestore with an ontology imported in it.

I know I can run SPARQL Update operations against it to insert, delete, and modify instances.

But what if these operations are misused, for example by inserting a triple that makes no sense and does not respect the ontology's rules? A triple like:

foo:Anna foo:likesToEat foo:arsenic.  

And the ontology looks like this:

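@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix foo: <http://www.foo.org/ontologies/example#>.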

foo:Anna rdf:type foo:Person.
foo:Anna rdf:type owl:NamedIndividual.
foo:Food rdf:type owl:Class.
foo:Metal rdf:type owl:Class.
foo:Person rdf:type owl:Class.
foo:arsenic rdf:type foo:Metal.
foo:arsenic rdf:type owl:NamedIndividual.
foo:likesToEat rdf:type owl:ObjectProperty.
foo:likesToEat rdfs:domain foo:Person.
foo:likesToEat rdfs:range foo:Food.
foo:pizza rdf:type foo:Food.
foo:pizza rdf:type owl:NamedIndividual.

As you can see the triple "foo:Anna foo:likesToEat foo:arsenic" is invalid because the range of the objectProperty is not respected.

My questions are:

Is there a way to validate this kind of update, so that the update operation executes only if the ontology is respected? Can the triple store be configured to validate these things, or does it have to be done manually?


Solution

  • As you can see the triple "foo:Anna foo:likesToEat foo:arsenic" is invalid because the range of the objectProperty is not respected.

    This is not how (RDF(S)) ontologies work. From the perspective of the ontology, that triple is perfectly valid. The fact that the range of foo:likesToEat is defined to be the class foo:Food just means that we can infer that foo:arsenic is of type foo:Food. There's nothing in your ontology that makes that invalid or inconsistent: after all, nowhere have you said that something cannot be both a Food and a Metal (for example, by declaring the two classes disjoint).

    More generally speaking: domain/range statements in RDF Schema are not about "closing" what a property can be used on. The semantics of RDF work the other way around: a domain/range statement on a property P specifies that whenever a triple X P Y occurs, we can infer that the subject X belongs to the domain class of P and the object Y belongs to the range class of P.
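
To make the inference direction concrete, here is a minimal sketch of the two RDFS entailment rules involved (rdfs2 for domain, rdfs3 for range), applied to the triples from the question. It deliberately uses plain Java and no Sesame classes: `RangeInferenceSketch`, `Triple`, and `inferTypes` are hypothetical names made up for this illustration, and a real RDFS reasoner does considerably more than this.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy illustration of RDFS domain/range entailment. Hypothetical code, not Sesame API. */
final class RangeInferenceSketch {

    /** A triple of plain strings; an assumed stand-in for a real RDF statement. */
    record Triple(String s, String p, String o) {}

    static final String TYPE = "rdf:type";
    static final String DOMAIN = "rdfs:domain";
    static final String RANGE = "rdfs:range";

    /** Returns the rdf:type triples entailed by the domain/range statements in data. */
    static Set<Triple> inferTypes(List<Triple> data) {
        Set<Triple> inferred = new LinkedHashSet<>();
        for (Triple schema : data) {
            boolean isDomain = schema.p().equals(DOMAIN);
            boolean isRange = schema.p().equals(RANGE);
            if (!isDomain && !isRange) continue;
            for (Triple t : data) {
                // Only triples that use the property described by this schema statement.
                if (!t.p().equals(schema.s())) continue;
                // rdfs2: the subject gets the domain class; rdfs3: the object gets the range class.
                String member = isDomain ? t.s() : t.o();
                inferred.add(new Triple(member, TYPE, schema.o()));
            }
        }
        return inferred;
    }

    public static void main(String[] args) {
        List<Triple> data = List.of(
            new Triple("foo:likesToEat", DOMAIN, "foo:Person"),
            new Triple("foo:likesToEat", RANGE, "foo:Food"),
            new Triple("foo:arsenic", TYPE, "foo:Metal"),
            new Triple("foo:Anna", "foo:likesToEat", "foo:arsenic"));
        // Nothing is rejected; the data simply entails additional type triples.
        inferTypes(data).forEach(System.out::println);
    }
}
```

With the question's data, the sketch rejects nothing. It concludes that foo:arsenic is a foo:Food in addition to the asserted foo:Metal, which is the behaviour described above.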

    There is no built-in functionality in Sesame to perform the kind of validation you are asking for, mostly for this reason.

    However, if you really wanted to, you could of course implement something that rejects or warns when a triple is being inserted that you consider invalid (for whatever reason). Depending on your use case you have several options:

    1. implement a Sail(Connection)Wrapper or a Repository(Connection)Wrapper to intercept insert operations and do the necessary validation.
    2. implement an RDFHandler (e.g. a subclass of RDFInserter) that does the validation, and use that handler to add/validate data (instead of using the standard RepositoryConnection.add methods directly).

    Either approach allows you to inspect every incoming triple, do a quick lookup in the database for its predicate, check if there are domain/range restrictions on it, and if the triple "violates" that restriction throw an error. The second approach is probably easiest to do, and also most flexible: you can employ this validation in some use cases in your code, and can skip it completely in places where you know it isn't necessary (because obviously, this kind of validation will come with a performance penalty).
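
The per-triple check described above could look roughly like the following sketch. It is written against a plain in-memory set rather than Sesame, so every name in it (`ValidatingInserterSketch`, `Triple`, `add`, `requireType`) is an assumption made for illustration. In Sesame the same logic would presumably live in the `handleStatement` method of an RDFHandler or in a connection wrapper, with the set lookups replaced by queries against the repository. The sketch also only looks at explicitly asserted rdf:type triples and ignores subclass reasoning.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy closed-world domain/range check before insertion. Hypothetical code, not Sesame API. */
final class ValidatingInserterSketch {

    /** A triple of plain strings; an assumed stand-in for a real RDF statement. */
    record Triple(String s, String p, String o) {}

    private final Set<Triple> store = new HashSet<>();

    ValidatingInserterSketch(List<Triple> ontology) {
        // Schema and existing instance data are loaded without any checking.
        store.addAll(ontology);
    }

    /** Adds the triple, or throws if its subject/object lacks the asserted domain/range type. */
    void add(Triple t) {
        for (Triple schema : store) {
            // Look only at schema statements about the incoming triple's predicate.
            if (!schema.s().equals(t.p())) continue;
            if (schema.p().equals("rdfs:domain")) requireType(t.s(), schema.o(), t);
            if (schema.p().equals("rdfs:range")) requireType(t.o(), schema.o(), t);
        }
        store.add(t);
    }

    private void requireType(String node, String cls, Triple offending) {
        if (!store.contains(new Triple(node, "rdf:type", cls))) {
            throw new IllegalArgumentException(
                offending + " rejected: " + node + " is not asserted to be a " + cls);
        }
    }

    boolean contains(Triple t) {
        return store.contains(t);
    }

    public static void main(String[] args) {
        ValidatingInserterSketch inserter = new ValidatingInserterSketch(List.of(
            new Triple("foo:likesToEat", "rdfs:domain", "foo:Person"),
            new Triple("foo:likesToEat", "rdfs:range", "foo:Food"),
            new Triple("foo:Anna", "rdf:type", "foo:Person"),
            new Triple("foo:pizza", "rdf:type", "foo:Food"),
            new Triple("foo:arsenic", "rdf:type", "foo:Metal")));
        inserter.add(new Triple("foo:Anna", "foo:likesToEat", "foo:pizza")); // accepted
        try {
            inserter.add(new Triple("foo:Anna", "foo:likesToEat", "foo:arsenic"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // the arsenic triple is refused
        }
    }
}
```

Note that this is a deliberate departure from RDFS semantics: it treats the domain/range statements as closed-world constraints, which is what the question asks for but not what the ontology itself says.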