elasticsearch, elasticsearch-6

Elasticsearch - decay documents using a property value


My documents are organized by category. There are 40 different categories, which are added to each document manually in the database and then indexed. This is what my document looks like:

{
  "name": "..",
  "categoryA": "..",
  "categoryB": "..",..
  "categoryDecayScore": 0.0 - 1.0
}

A document is considered well covered if it is part of all 40 categories. So, to push documents that are in all categories to the top, I wanted to use the decay function to reduce the score of those that are part of fewer categories.

For this I use the categoryDecayScore property, which is set at index time. If a document is part of all 40 categories, then its categoryDecayScore will be 0.0; if it's missing half but has more than 1/3, it will get a score of 0.2; and if it has less than 1/3, it will get a score of 0.3.

Then I also increase categoryDecayScore by 0.02 for less relevant scores.

What I want to do:
I would like documents that have categoryDecayScore > 0.0 to have their score decayed the farther they are from 0.0.

This is my filter function:

"filter": {
        "exp": {
          "categoryDecayScore" : {
            "origin" : 0.0,
            "scale" : 1.0,
            "offset" : 0.0,
            "decay" : 0.5
          }
        }
}
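For context, here is a minimal sketch of how a decay function is usually placed inside a function_score query, following the structure described in the documentation linked below. It is not the full query from this question, which is not shown: the match clause on name, the build_query helper, and the explicit boost_mode are assumptions made for illustration.

```python
import json


def build_query(text):
    """Build a function_score request body with an exp decay on categoryDecayScore.

    Hypothetical helper: the match clause on "name" is an assumption,
    not the query used in the question.
    """
    return {
        "query": {
            "function_score": {
                # Assumed text query; its score is what the decay multiplies.
                "query": {"match": {"name": text}},
                "functions": [
                    {
                        "exp": {
                            "categoryDecayScore": {
                                "origin": 0.0,
                                "scale": 1.0,
                                "offset": 0.0,
                                "decay": 0.5,
                            }
                        }
                    }
                ],
                # "multiply" is the documented default; stated here for clarity.
                "boost_mode": "multiply",
            }
        }
    }


print(json.dumps(build_query("example"), indent=2))
```

Running the search with the explain flag should then show the function score as a separate factor next to the text-matching score, which makes it easier to see whether the decay is being applied.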

My understanding of the documentation here:

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

is that origin is my point of reference, all documents with categoryDecayScore > 0.0 will be decayed, and any with categoryDecayScore >= 1.0 will be decayed by 0.5.

However, looking at my results, it seems this does not take effect. The top 4 documents all have the same score, but here are their categoryDecayScore values:

{
  _score: 51.970146,
  categoryDecayScore: 0.04
},
{
  _score: 51.970146,
  categoryDecayScore: 0.2
},
{
  _score: 51.970146,
  categoryDecayScore: 0.02
},
{
  _score: 51.970146,
  categoryDecayScore: 0.3
}

Is this normal behaviour, or am I understanding the decay function incorrectly? My assumption based on the docs is:

  • origin: point of reference from which distance is calculated
  • scale: upper point after which all documents are decayed by value of decay param
  • offset: point after which documents are decayed
  • decay: decay amount for all documents scored above or at scale value
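To check these assumptions against the formula given in the linked documentation, the exp decay can be reproduced by hand. The sketch below assumes the documented form, multiplier = exp(lambda * max(0, |value - origin| - offset)) with lambda = ln(decay) / scale; the function name exp_decay is made up for this example and is not part of any Elasticsearch client.

```python
import math


def exp_decay(value, origin=0.0, scale=1.0, offset=0.0, decay=0.5):
    """Score multiplier according to the documented exp decay formula."""
    # lambda is chosen so that the multiplier equals `decay`
    # at a distance of `scale` beyond the offset.
    lam = math.log(decay) / scale
    # Distances within `offset` of the origin are not decayed at all.
    distance = max(0.0, abs(value - origin) - offset)
    return math.exp(lam * distance)


# The categoryDecayScore values from the results above, plus the scale point.
for v in (0.02, 0.04, 0.2, 0.3, 1.0):
    print(f"categoryDecayScore={v}: multiplier={exp_decay(v):.4f}")
```

Under this formula the decay is continuous: a document exactly at scale is multiplied by decay (0.5 here), and documents farther away keep decaying rather than all receiving a flat 0.5. With the settings from the question, a value of 0.02 would give a multiplier of roughly 0.986 and a value of 0.3 roughly 0.812. This only reproduces the documented formula; it does not explain the explain output described in Note 1.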

Note 1:

Using the explain flag, I noticed that with those exp settings the evaluated decay score is always 1. So the 51.. score is just the text-matching score.


Solution

  • My query is/was correct. The issue was that my range of 0.0 - 1.0 was too small. So I decided to use whole integers instead of decimals, with a range from 0 to 1000. For the exclusion I then set the origin to 100 instead of 0. This returned the expected result.