r, scala, sparql, jena, blazegraph

How does a triplestore decide whether to add "background" triples?


I use a few different triplestores, and code in R and Scala. I think I'm seeing some differences in:

  • whether the triplestores include triples other than the ones I explicitly loaded.
  • the point at which these "background" triples might be added.

Are there any general rules for whether supporting vocabularies need to be added, independent of the implementation technology?

Using Jena in R, via rrdf, I usually only see what I loaded:

library(rrdf)
turtle.input.string <-
  "PREFIX prefix:  <http://example.com/>
   PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
   prefix:subject rdf:type prefix:object ."
jena.model <-
  fromString.rdf(rdfContent = turtle.input.string, format = "TURTLE")
model.string <- asString.rdf(jena.model, format = "TURTLE")
cat(model.string)

This gives:

@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix prefix: <http://example.com/> .
prefix:subject  a  prefix:object .

But sometimes triples from the RDF and RDFS vocabularies seem to appear when I add or remove triples afterwards. That's what bothers me the most, but I'm having trouble finding an example right now. If nobody knows what I mean, I'll dig something up later today.

When I use Blazegraph in Scala, via the OpenRDF Sesame library, I think I always get RDF, RDFS, and OWL triples "for free":

import java.util.Properties
import org.openrdf.query.QueryLanguage
import org.openrdf.rio._
import com.bigdata.journal._
import com.bigdata.rdf.sail._
object InjectionTest {
  val jnl_fn = "sparql_tests.jnl"
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(Options.BUFFER_MODE, BufferMode.DiskRW)
    props.put(Options.FILE, jnl_fn)
    val sail = new BigdataSail(props)
    val repo = new BigdataSailRepository(sail)
    repo.initialize()
    val cxn = repo.getConnection()
    // Dump every triple in the (still empty) store as Turtle
    val resultStream = new java.io.ByteArrayOutputStream
    val resultWriter = Rio.createWriter(RDFFormat.TURTLE, resultStream)
    val constructString = "construct {?s ?p ?o} where {?s ?p ?o}"
    cxn.prepareGraphQuery(QueryLanguage.SPARQL, constructString).evaluate(resultWriter)
    val resString = resultStream.toString("UTF-8")
    println(resString)
    // Release the connection and the journal file
    cxn.close()
    repo.shutDown()
  }
}

Even without adding any triples, the construct output includes blocks like this:

rdfs:isDefinedBy rdfs:domain rdfs:Resource ;
    rdfs:range rdfs:Resource ;
    rdfs:subPropertyOf rdfs:isDefinedBy , rdfs:seeAlso .

Solution

  • Are there any general rules for whether supporting vocabularies need to be added, independent of the implementation technology?

    That depends on what inferencing scheme your triplestore claims to support. For a pure RDF store (no inferencing), no additional triples should be added at all.

    Judging from the fragment you showed, the Blazegraph store you used has at least RDFS inferencing enabled (and possibly partial OWL reasoning as well). Note that this is specific to the store, not the framework, so it's not a Jena vs. Sesame thing: both frameworks support stores that do reasoning and stores that do not. Of course, if you use either framework and set the "exclude inferred triples" option that it offers, the backing store should respect that setting and leave inferred triples out of the result.
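
As a rough sketch of what switching inferencing off could look like for the Blazegraph setup in the question: the property keys below are the ones I recall from Blazegraph's sample .properties files, so treat them as assumptions and check them against your version. `NoInferenceProps` and `build` are illustrative names only, not part of any library.

```scala
import java.util.Properties

object NoInferenceProps {
  // Builds journal properties that ask for a plain triple store.
  // All keys and values are assumed from Blazegraph's sample property files.
  def build(journalFile: String): Properties = {
    val props = new Properties()
    // Same journal settings as in the question, written as string keys
    props.setProperty("com.bigdata.journal.AbstractJournal.file", journalFile)
    props.setProperty("com.bigdata.journal.AbstractJournal.bufferMode", "DiskRW")
    // Do not preload the RDF/RDFS/OWL axiom triples
    props.setProperty(
      "com.bigdata.rdf.store.AbstractTripleStore.axiomsClass",
      "com.bigdata.rdf.axioms.NoAxioms")
    // Do not compute or maintain entailed triples on updates
    props.setProperty("com.bigdata.rdf.sail.truthMaintenance", "false")
    props
  }
}
```

You would pass the result to the store in place of the original properties, e.g. `new BigdataSail(NoInferenceProps.build(jnl_fn))`. As far as I know these settings are applied when the journal is first created, so try it with a fresh journal file rather than an existing one. Independently of the store configuration, Sesame also lets you ask for explicit triples only on a per-query basis by calling `setIncludeInferred(false)` on the prepared query before `evaluate`.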