I have created an index whose mapping looks like this:
{
  "corona_data_search_ac_poc" : {
    "mappings" : {
      "properties" : {
        "Country" : {
          "type" : "text"
        },
        "Date" : {
          "type" : "text"
        },
        "IsImplicitIntent" : {
          "type" : "boolean"
        },
        "PopularityScore" : {
          "type" : "long"
        },
        "Query" : {
          "type" : "text",
          "fields" : {
            "query_suggest" : {
              "type" : "completion",
              "analyzer" : "simple",
              "preserve_separators" : true,
              "preserve_position_increments" : true,
              "max_input_length" : 50
            }
          }
        }
      }
    }
  }
}
A sample document looks like this:
{"Date": "01-01-2020", "Query": "coronavirus is deadly", "IsImplicitIntent": true, "Country": "United States", "PopularityScore": 1}
I am using Query.query_suggest for autocompletion. For this purpose, I generate a list of suffixes of the Query field using a script in the ingest pipeline.
So, for example, if "Query": "coronavirus is deadly", then "Query.query_suggest" should look like this:
"query_suggest" : {
  "input" : [
    "coronavirus is deadly",
    "is deadly",
    "deadly"
  ]
}
I am able to generate the suffix list using the script below:
{
  "script": {
    "source": """
      def tokens = new ArrayList(Arrays.asList(/\s+/.split(ctx.Query)));
      def nbTokens = tokens.size();
      def input = [];
      // Add the remaining tokens as one suffix, then drop the first token and repeat
      for (def i = nbTokens; i > 0; i--) {
        input.add(tokens.join(" "));
        tokens.remove(0);
      }
      // how to assign the list to the inner field?
      ctx.Query.query_suggest = [
        'input': input
      ];
    """
  }
}
I am not sure how I should assign the list to the inner field (see the comment in the script above) so that ES can build the autocomplete graph on top of that data.
NOTE: If I define query_suggest at the same level as Query in the mappings and then assign the value via ctx.query_suggest in the script, it works fine.
The mapping of your Query field is not correct. Query.query_suggest is a multi-field, so it is indexed from the same source value as Query itself and cannot be given a value of its own in the document. Since Query is of type text and holds a plain string, while the completion field needs the list of suffixes as its input, they have to be separate fields.
You need to have two separate fields at the top level:
"Query" : {
  "type" : "text"
},
"Query_Suggest": {
  "type" : "completion",
  "analyzer" : "simple",
  "preserve_separators" : true,
  "preserve_position_increments" : true,
  "max_input_length" : 50
}
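With that mapping, the pipeline script can write straight to the top-level field. The following is only a sketch, assuming the pipeline is named my-pipeline (the name used in the indexing request in this answer) and reusing the suffix logic from the question unchanged; the description text is illustrative. Depending on the Elasticsearch version, regex literals in Painless may need to be allowed through the script.painless.regex.enabled setting.

```
PUT _ingest/pipeline/my-pipeline
{
  "description": "Sketch: fill Query_Suggest.input with the suffixes of Query",
  "processors": [
    {
      "script": {
        "source": """
          // Split Query on whitespace (same regex as in the question)
          def tokens = new ArrayList(Arrays.asList(/\s+/.split(ctx.Query)));
          def nbTokens = tokens.size();
          def input = [];
          // Add the remaining tokens as one suffix, then drop the first token
          for (def i = nbTokens; i > 0; i--) {
            input.add(tokens.join(" "));
            tokens.remove(0);
          }
          // Query_Suggest is a top-level field, so it can be set directly on ctx
          ctx.Query_Suggest = [
            'input': input
          ];
        """
      }
    }
  ]
}
```

The only change from the question's script is the last assignment: it targets ctx.Query_Suggest instead of a sub-field of the Query string.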
Then indexing this:
PUT test/_doc/1?pipeline=my-pipeline
{
  "Query": "coronavirus is deadly"
}
yields this document:
{
  "Query" : "coronavirus is deadly",
  "Query_Suggest" : {
    "input" : [
      "coronavirus is deadly",
      "is deadly",
      "deadly"
    ]
  }
}
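To check that autocompletion works on the suffixes, you can run a completion suggester request against the Query_Suggest field. This is a minimal sketch, assuming the test index from the request above; the suggestion name query-suggest and the prefix "is d" are arbitrary choices for illustration.

```
POST test/_search
{
  "suggest": {
    "query-suggest": {
      "prefix": "is d",
      "completion": {
        "field": "Query_Suggest"
      }
    }
  }
}
```

Because "is deadly" was indexed as one of the inputs, a prefix that starts in the middle of the original query is intended to match it, which is the point of generating the suffix list.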