solr, lucene, stemming

Stemming demonyms in Solr (Russian => Russia)


I am trying to match queries containing "russia" or "russian" to "Russian Federation" using Solr, and to do the same for other country demonyms such as "american" and "syrian".

What is a good way to handle this without adding synonyms for each country, and without doing much stemming on other words?


Solution

  • It turned out that stemming was the right approach, but the Porter stemmer was too aggressive for some terms.

    The KStemFilterFactory is less aggressive and worked well.
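
For illustration, a field type using KStem might look roughly like the sketch below. This is not the original poster's configuration: the names `text_kstem` and `country` are hypothetical, and the tokenizer and lowercase filter are just a common default chain. The same analyzer is applied at index and query time so that both sides are stemmed the same way.

```xml
<!-- Sketch only: field type name and field name are assumptions -->
<fieldType name="text_kstem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase before stemming so "Russian" and "russian" are treated alike -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- KStem in place of the more aggressive solr.PorterStemFilterFactory -->
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>

<field name="country" type="text_kstem" indexed="true" stored="true"/>
```

Whether a given demonym reduces to the same token as its country name depends on KStem's rules, so it is worth checking terms such as "russian", "american" and "syrian" on the Analysis screen of the Solr admin UI before relying on it. Changing the analyzer requires reindexing the affected field.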