GraphDB supports an FTS Lucene plugin that builds RDF 'molecules' to index text efficiently. However, when there is a typo (misspelling) in the word you are searching for, Lucene does not retrieve a result. I wonder if it is possible to implement a FuzzyQuery based on the Damerau-Levenshtein algorithm on top of the Lucene index in GraphDB for FTS. That way, even if the word is not spelled correctly, you can get a list of 'close' words based on edit-distance similarity.
This is the index I have created for indexing the labels of NounSynset in WordNet RDF:
# luc: is assumed to be the GraphDB Lucene FTS plugin namespace
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>
PREFIX wn20schema: <http://www.w3.org/2006/03/wn/wn20/schema/>
INSERT DATA {
  luc:index luc:setParam "uris" .
  luc:include luc:setParam "literals" .
  luc:moleculeSize luc:setParam "1" .
  luc:includePredicates luc:setParam "http://www.w3.org/2000/01/rdf-schema#label" .
  luc:includeEntities luc:setParam wn20schema:NounSynset .
  luc:nounIndex luc:createIndex "true" .
}
When I run the query
select * where {
  { ?id luc:nounIndex "credict" }
  ?id luc:score ?score .
}
the result is empty. I would like to get at least the word "credit", since its edit distance from "credict" is 1.
Thank you!!!
If you append the ~ operator to the search term (Lucene's fuzzy search syntax, e.g. "credict~"), it should give you a fuzzy match.
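As a rough sketch, this is how the query from the question might look with the fuzzy operator applied. It assumes the luc: prefix is bound to the GraphDB Lucene FTS plugin namespace and that the index luc:nounIndex was created as in the question. How the value after ~ is interpreted (a maximum edit distance or a minimum similarity), and whether transpositions are counted as in Damerau-Levenshtein, depends on the Lucene version bundled with your GraphDB release, so treat this as a starting point rather than a confirmed recipe.

```sparql
# Assumption: namespace of the GraphDB Lucene FTS plugin
PREFIX luc: <http://www.ontotext.com/owlim/lucene#>

SELECT ?id ?score WHERE {
  # The trailing ~ asks the Lucene query parser for a fuzzy match on the term
  ?id luc:nounIndex "credict~" ;
      luc:score ?score .
}
ORDER BY DESC(?score)
```

Ordering by score should put the closest matches, such as "credit", near the top of the result list.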