Lucene: Setting minimum required similarity on searches


I'm having a lot of trouble with Lucene's similarity factor. I want searches to use a minimum similarity other than the default (0.5, according to the documentation), but it doesn't seem to work.

When I type a query that explicitly sets the required similarity factor, like [tinberland~0.5] (notice that I wrote tiNberland with an "N", while the correct spelling has an "M"), it returns many products from the manufacturer Timberland. But when I type just [tinberland] (no similarity factor in the query) and try to set the similarity in code, it doesn't work: the search returns no results.

This is the code I use to set the similarity:

multiFieldQueryParser.SetFuzzyMinSim(0.5F);

And I didn't change the Similarity algorithm, so it is using the DefaultSimilarity class.

Isn't that the correct or recommended way of applying similarity via code? Is there a specific QueryParser for fuzzy queries?

Any help is highly appreciated. Thanks in advance!


Solution

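  • What you are setting is the minimum similarity the parser applies to terms that are already fuzzy, i.e. terms written with a trailing ~. A search for foo~ with no explicit value is treated as foo~.5, while a plain foo stays an exact term query. It's not saying "turn every query into a fuzzy query."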

    You can use MultiFieldQueryParser.getFuzzyQuery like so:

    Query q = parser.getFuzzyQuery(field, term, minSimilarity);
    

    but that will of course require you to call getFuzzyQuery for each field. I'm not aware of a "MultiFieldFuzzyQueryParser" class, but all it would do is combine a bunch of those getFuzzyQuery calls.
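
A different workaround, not the one described above, follows from the question itself: the explicit form tinberland~0.5 already returns results, so the user's input can be rewritten into that form before it reaches the parser. The sketch below is only an illustration under stated assumptions. FuzzyRewriter and makeFuzzy are hypothetical names, not part of Lucene; the code is plain Java with no Lucene dependency; and it assumes a similarity value in the range [0, 1). It splits on whitespace, leaves operators, quoted phrases, and tokens that already use query syntax untouched, and does not handle escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Lucene): appends "~<minSimilarity>" to plain terms. */
public final class FuzzyRewriter {
    private FuzzyRewriter() {}

    // Tokens containing any of these characters already use query syntax and are left alone.
    private static final String SPECIAL = "~*?\"':^()[]{}+-!&|\\";

    /** Rewrites e.g. "tinberland" to "tinberland~0.5" so the parser sees an explicit fuzzy term. */
    public static String makeFuzzy(String userInput, float minSimilarity) {
        // Assumed valid range for the similarity value: 0 inclusive to 1 exclusive.
        if (minSimilarity < 0.0f || minSimilarity >= 1.0f) {
            throw new IllegalArgumentException("minSimilarity must be in [0, 1)");
        }
        List<String> out = new ArrayList<>();
        boolean inPhrase = false; // true while we are between an opening and a closing double quote
        for (String token : userInput.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            boolean plain = !inPhrase && isPlainTerm(token);
            // An odd number of double quotes in a token opens or closes a phrase.
            long quotes = token.chars().filter(c -> c == '"').count();
            if (quotes % 2 == 1) {
                inPhrase = !inPhrase;
            }
            out.add(plain ? token + "~" + minSimilarity : token);
        }
        return String.join(" ", out);
    }

    /** A plain term is not a boolean operator and contains no query-syntax characters. */
    static boolean isPlainTerm(String token) {
        if (token.equals("AND") || token.equals("OR") || token.equals("NOT")) {
            return false;
        }
        for (int i = 0; i < token.length(); i++) {
            if (SPECIAL.indexOf(token.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }
}
```

The rewritten string would then be handed to the same multiFieldQueryParser that already handles the explicit tinberland~0.5 case. Terms that carry their own similarity value, such as tinberland~0.8, are passed through unchanged.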