Suppose we have documents describing people, each with a name and an array of aliases, like this:
{
  "name": "Christian",
  "aliases": ["נוצרי", "کریستیان"]
}
Suppose one document has 10 aliases and another has 2, and both of them contain an alias with the value کریستیان.
The field length (dl) of the first document is larger than that of the second, so the tf component of the score is lower for the first document. As a result, the document with fewer aliases scores higher than the other one.
Sometimes I want to add more aliases for a person, in different languages and different forms, because he/she is more famous, but doing so lowers that person's score in the results. I want to somehow take the length of the aliases field out of my query's score calculation.
Norms store the relative length of the field.
How long is the field? The shorter the field, the higher the weight. If a term appears in a short field, such as a title field, it is more likely that the content of that field is about the term than if the same term appears in a much bigger body field.
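The effect of field length can be illustrated with the tf component that the explain output reports for BM25, freq / (freq + k1 * (1 - b + b * dl / avgdl)). The following is a minimal sketch, not the Elasticsearch implementation: the function names are hypothetical, k1 = 1.2 and b = 0.75 are the usual BM25 defaults, avgdl = 6 is an invented average, and each alias is assumed to produce exactly one token. The no-norms variant assumes the length term simply drops out when norms are disabled.

```python
def bm25_tf(freq, dl, avgdl, k1=1.2, b=0.75):
    """tf component of BM25 with field-length normalization (hypothetical helper)."""
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


def bm25_tf_no_norms(freq, k1=1.2):
    """Same component with the length term removed (assumed behaviour without norms)."""
    return freq / (freq + k1)


AVGDL = 6.0  # assumed average number of alias tokens per document

# The matching alias occurs once in each document (freq = 1).
tf_long = bm25_tf(freq=1, dl=10, avgdl=AVGDL)   # document with 10 aliases
tf_short = bm25_tf(freq=1, dl=2, avgdl=AVGDL)   # document with 2 aliases
tf_flat = bm25_tf_no_norms(freq=1)              # identical for both documents

print(f"10 aliases: {tf_long:.3f}")
print(f" 2 aliases: {tf_short:.3f}")
print(f"no norms:   {tf_flat:.3f}")
```

Under these assumptions the document with 2 aliases gets a clearly higher tf component than the one with 10, while without the length term both documents get the same value.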
Norms can be disabled (but not re-enabled afterwards) on an existing field using the PUT mapping API:
PUT my_index/_mapping
{
  "properties": {
    "title": {
      "type": "text",
      "norms": false
    }
  }
}
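For the documents in the question, the same request would presumably target the aliases field rather than title. The sketch below only builds and prints the request body; the helper name is hypothetical, and it assumes that aliases is mapped as a text field.

```python
import json


def norms_disabled_mapping(field_name):
    """Build a PUT mapping body that disables norms on one text field (hypothetical helper)."""
    return {
        "properties": {
            field_name: {
                "type": "text",
                "norms": False,  # serialized as JSON false
            }
        }
    }


# "aliases" is the field from the question; mapping it as text is an assumption.
body = norms_disabled_mapping("aliases")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of PUT my_index/_mapping. Norms are not removed instantly; they disappear as old segments are merged into new ones, so scores may stay inconsistent until the data has been reindexed or merged.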