With TermQuery and PhraseQuery my SerbianAnalyzer gets called, but not with FuzzyQuery. I tried both Lucene 4 and Lucene 7, and the behavior is the same. I have the following code:
Query query;
String field = "text";
String value = "дањ";
QueryParser queryParser = new QueryParser(field, new SerbianAnalyzer());
System.out.println("\nTermQuery");
query = new TermQuery(new Term(field, value));
System.out.println("Query (preParse): " + (TermQuery)query);
System.out.println("Query.toString(field): " + ((TermQuery)query).toString(field));
System.out.println("Query (afterParse): " + queryParser.parse(((TermQuery)query).toString(field)));
System.out.println("\nPhraseQuery");
String[] terms = value.split(" ");
query = new PhraseQuery(field, terms);
System.out.println("Query (preParse): " + ((PhraseQuery)query));
System.out.println("Query.toString(field): " + ((PhraseQuery)query).toString(field));
System.out.println("Query (afterParse): " + queryParser.parse(((PhraseQuery)query).toString(field)));
System.out.println("\nFuzzyQuery");
query = new FuzzyQuery(new Term(field, value), 1);
System.out.println("Query (preParse): " + ((FuzzyQuery)query));
System.out.println("Query.toString(field): " + ((FuzzyQuery)query).toString(field));
System.out.println("Query (afterParse): " + queryParser.parse(((FuzzyQuery)query).toString(field)));
The result I'm getting is:
TermQuery
Query (preParse): text:дањ
Query.toString(field): дањ
Query (afterParse): text:danj

PhraseQuery
Query (preParse): text:"дањ"
Query.toString(field): "дањ"
Query (afterParse): text:danj

FuzzyQuery
Query (preParse): text:дањ~1
Query.toString(field): дањ~1
Query (afterParse): text:дањ~1
The problem is that, for a long time, QueryParser did not apply the analyzer when the parsed query was a FuzzyQuery, WildcardQuery, PrefixQuery, or RegexpQuery.
To solve this problem, Lucene had the AnalyzingQueryParser class, which overrides Lucene's default QueryParser so that fuzzy, prefix, range, and wildcard queries are also passed through the given analyzer, while the wildcard characters * and ? are not removed from the search terms.
However, starting from Lucene 7.4 this functionality was merged into QueryParserBase, which now has the appropriate methods to handle those queries, such as:
protected Query getFuzzyQuery(String field,
                              String termStr,
                              float minSimilarity)
                       throws ParseException
So, instead of creating a plain QueryParser, you should create a ComplexPhraseQueryParser subclass that overrides this method, and call parse on that instance.
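To show what passing the fuzzy term through the analyzer means without depending on a particular Lucene version, here is a library-free sketch. It normalizes only the term part of a fuzzy query string and leaves the ~1 suffix untouched, which is roughly what an overridden getFuzzyQuery would need to do before delegating to the parent class. The class name FuzzyTermNormalizer, its methods, and the three-letter transliteration table are hypothetical stand-ins for SerbianAnalyzer. None of them is Lucene API.

```java
import java.util.Locale;

public class FuzzyTermNormalizer {

    // Toy stand-in for SerbianAnalyzer (an assumption for illustration only):
    // it maps just the three Cyrillic letters used in the question's example.
    static String transliterate(char c) {
        switch (c) {
            case '\u0434': return "d";   // Cyrillic "de"
            case '\u0430': return "a";   // Cyrillic "a"
            case '\u045A': return "nj";  // Cyrillic "nje"
            default:       return String.valueOf(c);
        }
    }

    // Lowercase the term and transliterate it character by character.
    public static String normalize(String term) {
        StringBuilder out = new StringBuilder();
        for (char c : term.toLowerCase(Locale.ROOT).toCharArray()) {
            out.append(transliterate(c));
        }
        return out.toString();
    }

    // Normalize the term in front of the last '~' and keep the fuzzy suffix as is.
    // A string without '~' is treated as a plain term.
    public static String normalizeFuzzy(String query) {
        int tilde = query.lastIndexOf('~');
        if (tilde < 0) {
            return normalize(query);
        }
        return normalize(query.substring(0, tilde)) + query.substring(tilde);
    }

    public static void main(String[] args) {
        // The fuzzy query string from the question, written with Unicode escapes.
        String input = "\u0434\u0430\u045A~1";
        System.out.println(normalizeFuzzy(input)); // prints danj~1
    }
}
```

In a real parser subclass, the normalize step would be replaced by running termStr through the configured analyzer, and the result would be handed to super.getFuzzyQuery.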