I can't see a difference between the results of queries #1 and #2 below. The results are the same, so why are there two separate query types? What is the difference between dis_max and should?
Query 1:
GET products/_search
{
  "query": {
    "dis_max": {
      "queries": [
        {
          "range": {
            "price": {
              "gte": 800,
              "lte": 1000
            }
          }
        },
        {
          "match_phrase_prefix": {
            "overview": "4k ultra hd"
          }
        }
      ]
    }
  }
}
Query 2:
GET products/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "range": {
            "price": {
              "gte": 800,
              "lte": 1000
            }
          }
        },
        {
          "match_phrase_prefix": {
            "overview": "4k ultra hd"
          }
        }
      ]
    }
  }
}
Both queries return the same set of documents, namely every document that matches at least one of the subqueries, but they calculate the score for each hit differently. The scores of the individual subqueries in should are added together, while in dis_max only the highest score is kept (with the default tie_breaker of 0; a nonzero tie_breaker adds that fraction of the other matching scores).
The range query gives a score of 1 if price is in the range and contributes nothing if it is not. The match_phrase_prefix query returns a score that depends on how well the document matches, or nothing if it doesn't match. You can see this in the following example:
DELETE test
PUT test/_bulk
{"index": {}}
{"overview": "4k ultra hd tv", "price": 820}
{"index": {}}
{"overview": "hd tv", "price": 810}
{"index": {}}
{"overview": "4k ultra hd ultra cheap tv", "price": 120}
POST test/_search?explain=false
{
  "query": {
    "dis_max": {
      "queries": [
        {
          "range": {
            "price": {
              "gte": 800,
              "lte": 1000
            }
          }
        },
        {
          "match_phrase_prefix": {
            "overview": "4k ultra hd"
          }
        }
      ]
    }
  }
}
POST test/_search?explain=false
{
  "query": {
    "bool": {
      "should": [
        {
          "range": {
            "price": {
              "gte": 800,
              "lte": 1000
            }
          }
        },
        {
          "match_phrase_prefix": {
            "overview": "4k ultra hd"
          }
        }
      ]
    }
  }
}
If you run this, you will see that with should the first record, which matches both the range query (score 1) and the match_phrase_prefix query (score 1.0735388), gets a total score of 1 + 1.0735388 = 2.0735388. If you run dis_max on the same data, the first record gets a score of max(1, 1.0735388) = 1.0735388. The remaining records match only one of the two subqueries, so their score is the same in both cases.
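To make the arithmetic concrete, here is a minimal Python sketch of the two combination rules. It is only a model of the scoring described above, not Elasticsearch code: the function names bool_should_score and dis_max_score are made up for illustration, and a score of 0.0 is used here to mean "this subquery did not match".

```python
def bool_should_score(scores):
    """Model of bool/should: add up the scores of all matching subqueries."""
    return sum(s for s in scores if s > 0)


def dis_max_score(scores, tie_breaker=0.0):
    """Model of dis_max: keep the best score, plus tie_breaker times the rest."""
    matching = [s for s in scores if s > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)


# Per-subquery scores as [range, match_phrase_prefix].
# The values for the first record are the ones quoted in the answer above.
both_match = [1.0, 1.0735388]   # "4k ultra hd tv", price 820
range_only = [1.0, 0.0]         # "hd tv", price 810

print(round(bool_should_score(both_match), 7))  # sum of both subquery scores
print(round(dis_max_score(both_match), 7))      # only the higher of the two
print(bool_should_score(range_only), dis_max_score(range_only))  # identical
```

With a single matching subquery the two functions return the same value, which is why the difference only shows up on records that match more than one subquery. Passing a nonzero tie_breaker to dis_max_score moves its result toward the should sum.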