I'm working with Elasticsearch and trying to implement a query from a Java backend that retrieves documents from my index not only by term but also by field priority. Each document in my index has a term and a field that specifies its type.
For example:
term: "Flu Shot"
type: "procedure"
term: "Fluphenazine"
type: "drug"
I created a query that searches by term, and the index returns the most relevant results matching that term. What I want is a query that returns results matching the same term but ordered by a priority on the 'type' field. For example, when I type "flu" I want to get the documents with type: "procedure" first, followed by the ones with type: "drug". Currently, the index returns only documents with type "drug", because many drug names start with "flu".
You can use function_score. The function_score query allows you to modify the score of documents that are retrieved by a query. To use function_score, you define a query and one or more functions that compute a new score for each document returned by the query.
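To make the scoring rule concrete, here is a minimal Java sketch that models how weight functions combine with the query score. It is not Elasticsearch code: the class `FunctionScoreSketch`, the `WeightFunction` type, and the `score` method are hypothetical names introduced for illustration. It assumes the default settings, where the weights of all matching functions are multiplied together and the result is multiplied with the score of the inner query, and it assumes a query score of 1.0.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class FunctionScoreSketch {

    // Hypothetical stand-in for one entry of the "functions" array: a filter plus a weight.
    static final class WeightFunction {
        final Predicate<Map<String, String>> filter;
        final double weight;

        WeightFunction(Predicate<Map<String, String>> filter, double weight) {
            this.filter = filter;
            this.weight = weight;
        }
    }

    // Models the assumed defaults: weights of all matching functions are multiplied
    // together, then multiplied with the score produced by the inner query.
    static double score(double queryScore, Map<String, String> doc, List<WeightFunction> functions) {
        double functionScore = 1.0; // a document matching no filter keeps its query score
        for (WeightFunction f : functions) {
            if (f.filter.test(doc)) {
                functionScore *= f.weight;
            }
        }
        return queryScore * functionScore;
    }

    public static void main(String[] args) {
        List<WeightFunction> functions = List.of(
                new WeightFunction(d -> "procedure".equals(d.get("type")), 2.0),
                new WeightFunction(d -> "drug".equals(d.get("type")), 1.0));

        double queryScore = 1.0; // assumed score of the inner query for every match

        // prints 2.0
        System.out.println(score(queryScore, Map.of("term", "Flu Shot", "type", "procedure"), functions));
        // prints 1.0
        System.out.println(score(queryScore, Map.of("term", "Fluphenazine", "type", "drug"), functions));
    }
}
```

Under these assumptions, a higher weight for a type pushes every matching document of that type above documents of lower-weighted types.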
Here is an example using the data from your question (using Elasticsearch server 7.9):
Create the index and add documents:
PUT /example_index
{
  "mappings": {
    "properties": {
      "term": { "type": "text" },
      "type": { "type": "keyword" }
    }
  }
}
PUT /_bulk
{"create": {"_index": "example_index", "_id": 1}}
{"term": "Flu Shot", "type": "procedure"}
{"create": {"_index": "example_index", "_id": 2}}
{"term": "Fluphenazine", "type": "drug"}
{"create": {"_index": "example_index", "_id": 3}}
{"term": "Flu Shot2", "type": "procedure"}
{"create": {"_index": "example_index", "_id": 4}}
{"term": "Fluphenazine2", "type": "drug"}
Query documents using custom scoring logic
GET /example_index/_search
{
  "query": {
    "function_score": {
      "query": {
        "wildcard": {
          "term": {
            "value": "*flu*"
          }
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "type": "procedure"
            }
          },
          "weight": 2
        },
        {
          "filter": {
            "term": {
              "type": "drug"
            }
          },
          "weight": 1
        }
      ]
    }
  }
}
Results:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 2.0,
"hits" : [
{
"_index" : "example_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 2.0,
"_source" : {
"term" : "Flu Shot",
"type" : "procedure"
}
},
{
"_index" : "example_index",
"_type" : "_doc",
"_id" : "3",
"_score" : 2.0,
"_source" : {
"term" : "Flu Shot2",
"type" : "procedure"
}
},
{
"_index" : "example_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"term" : "Fluphenazine",
"type" : "drug"
}
},
{
"_index" : "example_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"term" : "Fluphenazine2",
"type" : "drug"
}
}
]
}
}
You can see that the documents with type set to procedure have a higher score than the documents with type set to drug. This is because we've assigned different weights to the different types in the function_score query.
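Since the question asks about a Java backend, the same request could be built and sent roughly as follows. This is only a sketch: it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) rather than an Elasticsearch client library. The class `TypePrioritySearch`, its methods, the rule of deriving weights from list position, and the `http://localhost:9200` address are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

class TypePrioritySearch {

    // Minimal JSON string escaping for the values interpolated below.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the function_score body. Types earlier in the list get a higher
    // weight: for n types the weights are n, n-1, ..., 1 (an assumed convention).
    static String buildQuery(String pattern, List<String> typesByPriority) {
        StringBuilder functions = new StringBuilder();
        int n = typesByPriority.size();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                functions.append(",");
            }
            functions.append("{\"filter\":{\"term\":{\"type\":\"")
                    .append(escape(typesByPriority.get(i)))
                    .append("\"}},\"weight\":")
                    .append(n - i)
                    .append("}");
        }
        return "{\"query\":{\"function_score\":{"
                + "\"query\":{\"wildcard\":{\"term\":{\"value\":\"" + escape(pattern) + "\"}}},"
                + "\"functions\":[" + functions + "]}}}";
    }

    // Sends the body to <baseUrl>/<index>/_search and returns the raw JSON response.
    static String search(String baseUrl, String index, String body)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) {
        String body = buildQuery("*flu*", List.of("procedure", "drug"));
        System.out.println(body);
        // With a cluster running (the address is an assumption):
        // System.out.println(search("http://localhost:9200", "example_index", body));
    }
}
```

Passing the types as an ordered list keeps the priority in one place: adding a third type only means extending the list, and the weights are recomputed from the positions.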