Ignore text length in score calculation in Elasticsearch


I am working with Elasticsearch 5, using NEST 5 from C#. I have an index of Person documents, which have First Name and Last Name fields, among others.

For these fields I am using a "whitespace" tokenizer with "trim" and "lowercase" token filters. I implemented a search over these two fields. The problem I am having is with the score calculation. Here is an example to illustrate it:

Suppose I search for "Lucas Gonzales" and I have two documents:

Document 1:
FirstName = "Lucas"
LastName = "Perez"

Document 2:
FirstName = "Lucas Juan Jose"
LastName = "Gonzales de Perez Almeida"

Even though document 2 contains both terms (Lucas and Gonzales), document 1 is returned first. When I run the query with explain enabled in Kibana, I see that the score for document 2 is lower because its text is longer.
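For context, the length penalty seen in the explanation comes from BM25, the default similarity in Elasticsearch 5. Below is a sketch of the per-term score using the names that the explain output reports (fieldLength, avgFieldLength), followed by worked values for the FirstName field. The numbers rest on assumptions: a term frequency of 1, the default parameters k1 = 1.2 and b = 0.75, an index containing only the two documents above (so avgFieldLength = 2), and exact field lengths, whereas Lucene stores them in a lossy encoded form.

```latex
\mathrm{score}(t, d) = \mathrm{idf}(t) \cdot \frac{tf \cdot (k_1 + 1)}{tf + k_1 \cdot \left(1 - b + b \cdot \frac{\mathrm{fieldLength}}{\mathrm{avgFieldLength}}\right)}

% Worked values for FirstName with tf = 1, k_1 = 1.2, b = 0.75, avgFieldLength = 2 (assumed)
\text{Document 1 (fieldLength = 1):} \quad \frac{2.2}{1 + 1.2 \cdot (0.25 + 0.375)} = \frac{2.2}{1.75} \approx 1.26

\text{Document 2 (fieldLength = 3):} \quad \frac{2.2}{1 + 1.2 \cdot (0.25 + 1.125)} = \frac{2.2}{2.65} \approx 0.83
```

Under these assumptions, the same match on "Lucas" is worth roughly a third less in document 2 simply because its FirstName field holds three tokens instead of one.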

What I want is for the documents that match the most terms to be ranked first, regardless of text length (or any other criterion). In this example the second document matches two terms, "Lucas" and "Gonzales", so it should be returned first.

Is there a good way to do this, taking into account that my Person model has around 20 properties and the query needs to run fast?


Solution

  • You should disable norms on your fields to eliminate the impact of field length on the score. Here is an example of the mapping:

    PUT my_index/_mapping/my_type
    {
        "properties": {
            "firstName": {
                "type": "text",
                "norms": false
            },
            "lastName": {
                "type": "text",
                "norms": false
            }
        }
    }
    

    Or, with the NEST client, here is the equivalent attribute mapping:

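    using Nest;

    class Sample
    {
        // Norms = false disables length normalization for this field
        [Text(Norms = false)]
        public string FirstName { get; set; }
    
        [Text(Norms = false)]
        public string LastName { get; set; }
    }
    
    // client is an existing ElasticClient; indexName is the name of the index to create.
    // AutoMap() builds the mapping from the attributes on Sample.
    client.CreateIndex(indexName,
        create => create.Mappings(
            mappings => mappings.Map<Sample>(
                map => map.AutoMap()
            )
        )
    );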
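
To check that field length no longer influences the score, the search can be rerun with explain enabled. The request below is only a sketch: the question does not show the actual query, so the multi_match query is an assumption, and my_index and the field names are taken from the mapping example above.

```json
GET my_index/_search
{
    "explain": true,
    "query": {
        "multi_match": {
            "query": "Lucas Gonzales",
            "fields": ["firstName", "lastName"]
        }
    }
}
```

With norms disabled, the tfNorm part of each explanation should no longer depend on fieldLength. Note that when norms are disabled on an existing field, they are not removed instantly; they disappear as old segments are merged, so scores on an already populated index may only change after reindexing or further indexing activity.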