I create an index like this:
curl --location --request PUT 'http://127.0.0.1:9200/test/' \
--header 'Content-Type: application/json' \
--data-raw '{
  "settings" : {
    "number_of_shards" : 1
  },
  "mappings" : {
    "properties" : {
      "word" : { "type" : "text" }
    }
  }
}'
Then I create a document:
curl --location --request POST 'http://127.0.0.1:9200/test/_doc/' \
--header 'Content-Type: application/json' \
--data-raw '{ "word":"organic" }'
And finally, search with an intentionally misspelled word:
curl --location --request POST 'http://127.0.0.1:9200/test/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
  "suggest": {
    "001" : {
      "text" : "rganic",
      "term" : {
        "field" : "word"
      }
    }
  }
}'
The word 'organic' has lost its first letter, and ES never returns suggestion options for this kind of misspelling (it works absolutely fine for other misspellings such as 'orgnic', 'oragnc' and 'organi'). What am I missing?
This is happening because of the prefix_length parameter: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters.html . It defaults to 1, i.e. at least the first character of the input text has to match the first character of a term for that term to be considered as a suggestion. 'rganic' starts with 'r' and 'organic' starts with 'o', so 'organic' is never a candidate. You can set prefix_length to 0, but this will have performance implications. Only your hardware, your setup and your dataset can show you what those will be in practice, so try it and measure :). However, be careful: the Elasticsearch and Lucene devs set the default to 1 for a reason.
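To make the prefix rule concrete, here is a rough Python sketch of the filtering step only. It is a simplified model, not Elasticsearch or Lucene code: the function names `shares_prefix` and `candidates` are made up for illustration, and the edit-distance scoring that the real suggester applies afterwards is left out.

```python
def shares_prefix(text, term, prefix_length):
    """True if term starts with the first prefix_length characters of text."""
    return term[:prefix_length] == text[:prefix_length]


def candidates(text, indexed_terms, prefix_length=1):
    # Hypothetical helper: keep only the indexed terms that pass the prefix
    # check. The real suggester would then rank the survivors by edit
    # distance, which this sketch omits.
    return [t for t in indexed_terms if shares_prefix(text, t, prefix_length)]


terms = ["organic"]
print(candidates("rganic", terms))                   # [] since "r" != "o"
print(candidates("orgnic", terms))                   # ['organic']
print(candidates("rganic", terms, prefix_length=0))  # ['organic']
```

With the default of 1, a misspelling in the first character removes the correct term before any fuzzy matching happens, whereas errors later in the word still pass the check.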
Here's a query that returns the suggestion you're after for me on Elasticsearch 7.4.0, after performing your setup steps:
curl --location --request POST 'http://127.0.0.1:9200/test/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
  "suggest": {
    "001" : {
      "text" : "rganic",
      "term" : {
        "field" : "word",
        "prefix_length": 0
      }
    }
  }
}'