solr  search-engine  tokenize  n-gram  filterfactory

Difference between NGramFilterFactory and EdgeNGramFilterFactory


I am a beginner in Solr. In my project, NGramFilterFactory and EdgeNGramFilterFactory are both used on the same field. My understanding from the documentation is that EdgeNGramFilterFactory is meant for "starts with" queries, while NGramFilterFactory is suitable for "contains" queries.

I indexed a small dataset with two configurations (one using only NGramFilterFactory, the other using both NGramFilterFactory and EdgeNGramFilterFactory), but I did not see any difference in the output.

If my understanding is correct, the output of EdgeNGramFilterFactory is effectively a subset of the output of NGramFilterFactory. If that is true, is there any benefit to using both filters on the same field?


Solution

  • You should not use both filters on the same field; combining them will completely mess up your matching. If you need to match in the middle of a token, use NGrams. If you only need to match from the start of a token, use EdgeNGrams. Never use both together.
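
To make the difference concrete, here is a minimal Python sketch of the terms each kind of filter produces for a single token. It is a simplified model, not Solr's implementation: the function names `ngrams` and `edge_ngrams` are hypothetical, and the `min_size` / `max_size` arguments only stand in for the filters' `minGramSize` / `maxGramSize` settings.

```python
def ngrams(token, min_size, max_size):
    """Every substring of token whose length is in [min_size, max_size].

    Simplified stand-in for what an n-gram filter emits (assumption).
    """
    return [
        token[start:start + size]
        for size in range(min_size, max_size + 1)
        for start in range(len(token) - size + 1)
    ]


def edge_ngrams(token, min_size, max_size):
    """Only the prefixes of token whose length is in [min_size, max_size].

    Simplified stand-in for what an edge n-gram filter emits (assumption).
    """
    longest = min(max_size, len(token))
    return [token[:size] for size in range(min_size, longest + 1)]


token = "solr"
all_grams = ngrams(token, 2, 4)
prefix_grams = edge_ngrams(token, 2, 4)

print(all_grams)     # ['so', 'ol', 'lr', 'sol', 'olr', 'solr']
print(prefix_grams)  # ['so', 'sol', 'solr']

# Every prefix gram is already among the n-grams.
print(set(prefix_grams) <= set(all_grams))  # True
```

In this model, with the same size limits, the edge n-grams are a subset of the n-grams, which is consistent with seeing no difference in the output once NGramFilterFactory is already applied. A query for "ol" would match only the n-gram terms ("contains"), while a query for "so" matches both ("starts with").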