GraphDB + Lucene Index: can I get the matched predicate / literal?


Following the instructions, I set up an index that covers multiple literal predicates:

  PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
  INSERT DATA {
    luc:index             luc:setParam "uris" .
    luc:include           luc:setParam "literals" .
    luc:moleculeSize      luc:setParam "1" .
    luc:includePredicates luc:setParam "http://purl.org/dc/terms/title http://www.w3.org/2000/01/rdf-schema#label http://www.w3.org/2004/02/skos/core#prefLabel http://www.w3.org/2004/02/skos/core#altLabel" .
  }

and

PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
INSERT DATA {
  luc:${Cfg.literalIndex}   luc:createIndex   "true" .
}

This part seems to work just fine. My question now: is there some way to get the matched predicate or literal in my SPARQL query?

So assume the following data:

:exA rdfs:label     'label' ;
     dct:title      'title' .

I'd like to do something like this:

SELECT *
WHERE {
  ?needle luc:labelIndex "title" ;
          luc:predicate  ?predicate ;
          ?predicate     ?label .
}

If something like this luc:predicate existed, it could give me the actually matched predicate alongside the matched value. However, I'm not even sure Lucene indexes the predicate, which would be needed to enable such a function.


Solution

  • You can't do this efficiently with the legacy FTS Lucene plugin. However, the Lucene Connector easily supports your use case. Here is a sample with some mock data:

    Sample data

    <urn:a> a <http://www.w3.org/2004/02/skos/core#Concept> ;
        <http://purl.org/dc/terms/title> "title"; 
        <http://www.w3.org/2000/01/rdf-schema#label> "label" ; 
        <http://www.w3.org/2004/02/skos/core#prefLabel> "prefer label"; 
        <http://www.w3.org/2004/02/skos/core#altLabel> "alt label" .
    

    Note: Connectors index data for resources of a given rdf:type. In your example, I believe that should be skos:Concept.

    Create Lucene Connector

    For the selected type, the connector indexes every configured property or property chain into a separate Lucene field.

    PREFIX : <http://www.ontotext.com/connectors/lucene#>
    PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>
    
    INSERT DATA {
        inst:fts :createConnector '''
    {
      "types": [
        "http://www.w3.org/2004/02/skos/core#Concept"
      ],
      "fields": [
        {
          "fieldName": "label",
          "propertyChain": [
            "http://www.w3.org/2000/01/rdf-schema#label"
          ]
        },
        {
          "fieldName": "prefLabel",
          "propertyChain": [
            "http://www.w3.org/2004/02/skos/core#prefLabel"
          ]
        },
        {
          "fieldName": "altLabel",
          "propertyChain": [
            "http://www.w3.org/2004/02/skos/core#altLabel"
          ]
        }
      ]
    }
    ''' .
    }
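
    The index in the question also covered dct:title, which the connector above leaves out. Here is a rough sketch of how it could be added. It assumes the connector is still named fts and that dropping and recreating it is acceptable. The field name "title" is my own choice for illustration.

    ```sparql
    PREFIX : <http://www.ontotext.com/connectors/lucene#>
    PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>

    # Remove the existing connector instance so it can be recreated.
    INSERT DATA {
        inst:fts :dropConnector [] .
    } ;

    # Recreate it with an extra field for dct:title.
    # "title" is an assumed field name; any unique name should do.
    INSERT DATA {
        inst:fts :createConnector '''
    {
      "types": ["http://www.w3.org/2004/02/skos/core#Concept"],
      "fields": [
        { "fieldName": "title",
          "propertyChain": ["http://purl.org/dc/terms/title"] },
        { "fieldName": "label",
          "propertyChain": ["http://www.w3.org/2000/01/rdf-schema#label"] },
        { "fieldName": "prefLabel",
          "propertyChain": ["http://www.w3.org/2004/02/skos/core#prefLabel"] },
        { "fieldName": "altLabel",
          "propertyChain": ["http://www.w3.org/2004/02/skos/core#altLabel"] }
      ]
    }
    ''' .
    }
    ```

    If your client does not accept two update operations in one request, run the drop and the create as separate updates.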
    

    Return the matching fields and snippet

    PREFIX : <http://www.ontotext.com/connectors/lucene#>
    PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>
    SELECT ?entity ?snippetField ?snippetText {
        ?search a inst:fts ;
                :query "label" ;
                :entities ?entity .
        ?entity :snippets _:s .
        _:s :snippetField ?snippetField ;
            :snippetText ?snippetText .
    }
    

    Where in the projection:

    • ?entity is the RDF resource for which a property or a property chain matched, i.e. <urn:a> in the sample data
    • ?snippetField is the field name matching the full-text query
    • ?snippetText is the matched snippet value
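
    The snippet field is a connector field name, not a predicate IRI. To get back to the predicate the question asks for, one option is to map field names to predicates by hand. This is only a sketch: the VALUES table is an assumption that has to mirror whatever "fieldName" / "propertyChain" pairs your connector definition uses, and it only works for single-property chains.

    ```sparql
    PREFIX : <http://www.ontotext.com/connectors/lucene#>
    PREFIX inst: <http://www.ontotext.com/connectors/lucene/instance#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

    SELECT ?entity ?predicate ?value ?snippetText {
        ?search a inst:fts ;
                :query "label" ;
                :entities ?entity .
        ?entity :snippets _:s .
        _:s :snippetField ?snippetField ;
            :snippetText ?snippetText .

        # Hand-maintained mapping from connector field name to predicate.
        # Must be kept in sync with the connector definition.
        VALUES (?snippetField ?predicate) {
            ("label"     rdfs:label)
            ("prefLabel" skos:prefLabel)
            ("altLabel"  skos:altLabel)
        }

        # Fetch the stored literal(s) for the matched predicate.
        ?entity ?predicate ?value .
    }
    ```

    The snippet text may carry highlighting markup, so ?value is read from the repository rather than from the snippet. If an entity has several values for the same predicate, all of them are returned, not only the one that matched.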