elasticsearch, nest, n-gram, elasticsearch-analyzers

Elasticsearch: Avoid repetitive scoring when using an ngram analyzer


Suppose I search for "hello" and the index contains two documents, "hello" and "hello hello". I want "hello" to score higher.

I am using an ngram analyzer for both indexing and searching (I really need it for other scenarios). As a result, "hello hello" gets matched twice and shows up as the top result. Is there any way to avoid this? I have already tried term, match phrase, and multi match queries; all of them score "hello hello" higher.
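
To make the "matched twice" effect concrete, here is a minimal sketch in plain C# (no Elasticsearch involved) that counts how many indexed n-grams coincide with the query's n-grams. The fixed gram size of 3 and the split-on-whitespace behavior are assumptions, since the question does not show its analyzer settings, and `NgramDemo` is a hypothetical helper. Real scoring (BM25 by default in recent Elasticsearch versions) also weighs field length and term rarity, so this only illustrates the term-frequency part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NgramDemo
{
    // Split on spaces, then emit every n-character substring of each word.
    // This approximates an ngram tokenizer limited to letter characters
    // (an assumption; the real analyzer configuration is not shown).
    public static List<string> Ngrams(string text, int n)
    {
        var grams = new List<string>();
        var words = text.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            for (var i = 0; i + n <= word.Length; i++)
            {
                grams.Add(word.Substring(i, n));
            }
        }
        return grams;
    }

    // Count the document grams that also appear among the query grams.
    public static int MatchingOccurrences(string document, string query, int n)
    {
        var queryGrams = new HashSet<string>(Ngrams(query, n));
        return Ngrams(document, n).Count(g => queryGrams.Contains(g));
    }

    public static void Main()
    {
        Console.WriteLine(MatchingOccurrences("hello", "hello", 3));       // 3
        Console.WriteLine(MatchingOccurrences("hello hello", "hello", 3)); // 6
    }
}
```

Every gram of the query occurs twice in "hello hello", which is why the repeated document tends to outrank the single "hello".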


Solution

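  • I solved this by adding a duplicate, unanalyzed (keyword) field to the document and using a bool query whose should clauses combine a term query on that field with the ngram match query. Only the document whose whole value is "hello" matches the term query, so it picks up the extra score.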

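    using Nest;

    // `client` is an existing ElasticClient; MyClass is the indexed document type.
    var res = client.Search<MyClass>(s => s
        .Query(q => q
            .Bool(b => b
                .Should(
                    // Exact match on the unanalyzed keyword copy. Only a document whose
                    // whole value is "hello" matches, so it gains this clause's score.
                    sh => sh.Term(t => t
                        .Field(f => f._DUPLICATE_COLUMN)
                        .Value("hello")
                        .Boost(1) // raise above 1 to favor exact matches more strongly
                    ),
                    // Ngram match on the analyzed field, so partial matches still return.
                    sh => sh.Match(m => m
                        .Field(f => f.MY_COLUMN)
                        .Query("hello")
                        .Analyzer("myNgramSearchAnalyzer")
                    )
                )
                .MinimumShouldMatch(1)
            )
        )
    );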
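
The answer does not show how the keyword field is mapped. A minimal sketch of one way to add it, assuming NEST 7.x and an existing index, could look like this. The index name `my_index`, the `MappingSketch` helper, and the `string` property types on `MyClass` are assumptions for illustration, not taken from the original post.

```csharp
using Nest;

// Assumed shape of the document type; only the two fields used in the query.
public class MyClass
{
    public string MY_COLUMN { get; set; }          // analyzed with the ngram analyzer
    public string _DUPLICATE_COLUMN { get; set; }  // same text, stored unanalyzed
}

public static class MappingSketch
{
    // Adds the keyword copy to an existing index.
    // "my_index" is a placeholder index name.
    public static PutMappingResponse AddKeywordCopy(IElasticClient client)
    {
        return client.Indices.PutMapping<MyClass>(m => m
            .Index("my_index")
            .Properties(p => p
                // A keyword field is indexed as a single unanalyzed term.
                .Keyword(k => k.Name(n => n._DUPLICATE_COLUMN))
            )
        );
    }
}
```

Adding the mapping does not backfill existing documents, so they would need to be reindexed with `_DUPLICATE_COLUMN` set to the same text as `MY_COLUMN`. Term queries on a keyword field are case-sensitive unless a normalizer is configured.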