I'm trying to boost relevance based on how often a field value occurs: the rarer the field value, the more relevant the document.
For example, I have 1001 documents. 1000 documents are written by John, and only one is written by Joe.
// 1000 documents by John
{"title": "abc 1", "author": "John"}
{"title": "abc 2", "author": "John"}
// ...
{"title": "abc 1000", "author": "John"}
// 1 document by Joe
{"title": "abc 1", "author": "Joe"}
I get 1001 documents when I search for "abc" against the title field. Their relevance scores should be very similar, if not exactly the same. The field value "John" occurs 1000 times and the field value "Joe" occurs once. I'd like to boost the relevance of the document {"title": "abc 1", "author": "Joe"}; otherwise it would be really hard to spot the document written by Joe.
Thank you!
In case someone runs into the same use case, here is my workaround using a Function Score Query. It takes at least two calls to the Elasticsearch server.
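The first of those calls presumably fetches the document count per author. A minimal Python sketch of that step, assuming a terms aggregation is used and that `author` is indexed as a keyword (not analyzed) field. The aggregation name `author_counts`, the helper names, and the sample response are hypothetical and only mirror the shape of an Elasticsearch aggregation response:

```python
def build_author_count_request(field="author", max_authors=1000):
    """Request body for a terms aggregation that counts documents per author.

    "size": 0 suppresses the hits; only the aggregation buckets are needed.
    """
    return {
        "size": 0,
        "aggs": {
            "author_counts": {  # hypothetical aggregation name
                "terms": {"field": field, "size": max_authors}
            }
        },
    }


def parse_author_counts(response):
    """Turn the aggregation buckets into an {author: doc_count} dict."""
    buckets = response["aggregations"]["author_counts"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written sample in the shape of an aggregation response (not real output).
sample_response = {
    "aggregations": {
        "author_counts": {
            "buckets": [
                {"key": "John", "doc_count": 1000},
                {"key": "Joe", "doc_count": 1},
            ]
        }
    }
}

author_counts = parse_author_counts(sample_response)
```

The resulting `author_counts` dict is what the weights for the second call are derived from.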
First get the document count for each author, then derive a weight from it: 1 + sqrt(1/1000) for John and 1 + sqrt(1/1) for Joe. Then use the weight in the script to calculate the score according to the author value (the script could be much better):
{
  "query": {
    "function_score": {
      "query": {
        "match": { "title": "abc" }
      },
      "script_score": {
        "script": {
          "inline": "if (doc['author'].value == 'John') {return (1 + sqrt(1/1000)) * _score}\n return (1 + sqrt(1/1)) * _score;"
        }
      }
    }
  }
}
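Hard-coding one branch per author does not scale, so here is one possible generalization, sketched in Python. It swaps the script for the `functions` array of `function_score`, with one `filter`/`weight` pair per author, which is a different technique from the script above. The helper names are hypothetical, and the sketch assumes `author` is a keyword (not analyzed) field so that the `term` filter matches the stored value exactly:

```python
import json
import math


def author_weight(doc_count):
    """Weight from the answer above: 1 + sqrt(1 / count). Rarer authors score higher."""
    return 1 + math.sqrt(1 / doc_count)


def build_query(title_text, author_counts):
    """Build a function_score body with one filter/weight pair per author."""
    functions = [
        {"filter": {"term": {"author": author}}, "weight": author_weight(count)}
        for author, count in sorted(author_counts.items())
    ]
    return {
        "query": {
            "function_score": {
                "query": {"match": {"title": title_text}},
                "functions": functions,
                "score_mode": "first",     # apply only the first matching filter
                "boost_mode": "multiply",  # final score = _score * weight
            }
        }
    }


body = build_query("abc", {"John": 1000, "Joe": 1})
print(json.dumps(body, indent=2))
```

With the counts from the example, Joe's document gets a weight of 2 and John's documents a weight of roughly 1.03, so the single Joe document should rank above the John documents whose text score is similar.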