Elasticsearch spell check suggestions even if first letter missed


I create an index like this:

curl --location --request PUT 'http://127.0.0.1:9200/test/' \
--header 'Content-Type: application/json' \
--data-raw '{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "properties" : {
            "word" : { "type" : "text" }
        }
    }
}'

Then I create a document:

curl --location --request POST 'http://127.0.0.1:9200/test/_doc/' \
--header 'Content-Type: application/json' \
--data-raw '{ "word":"organic" }'

And finally, search with an intentionally misspelled word:

curl --location --request POST 'http://127.0.0.1:9200/test/_search' \
--header 'Content-Type: application/json' \
--data-raw '{
  "suggest": {
    "001" : {
      "text" : "rganic",
      "term" : {
        "field" : "word"
      }
    }
  }
}'

The word 'organic' has lost its first letter, and ES never returns suggestion options for such a misspelling (it works absolutely fine for other misspellings: 'orgnic', 'oragnc' and 'organi'). What am I missing?


Solution

  • This happens because of the term suggester's prefix_length parameter: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters.html . It defaults to 1, i.e. the first character of a candidate term has to match the first character of the input exactly, so 'rganic' can never produce 'organic'. You can set prefix_length to 0, but this has performance implications: the documentation notes that a longer required prefix improves spellcheck performance. Only your hardware, your setup and your dataset can show what the cost is in your case, so try it and measure. Be careful, though: the Elasticsearch and Lucene devs set the default to 1 for a reason.

    Here's a query that, for me, returns the suggestion you're after on Elasticsearch 7.4.0 once your setup steps have been run:

    curl --location --request POST 'http://127.0.0.1:9200/test/_search' \
    --header 'Content-Type: application/json' \
    --data-raw '{
      "suggest": {
        "001" : {
          "text" : "rganic",
          "term" : {
            "field" : "word",
            "prefix_length": 0
          }
        }
      }
    }'
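
To see why the prefix requirement rules out 'organic' for 'rganic', here is a rough Python sketch of the filtering idea. It is a toy model, not how Elasticsearch or Lucene actually implement the term suggester: the names `edit_distance` and `toy_suggest` are made up for illustration, it uses plain Levenshtein distance (an assumption; Elasticsearch's default distance measure differs), and it ignores scoring and term frequencies entirely.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def toy_suggest(text, indexed_terms, prefix_length=1, max_edits=2):
    """Hypothetical helper: keep terms that share the first `prefix_length`
    characters with `text` and lie within `max_edits` edits of it."""
    prefix = text[:prefix_length]
    return [
        term for term in indexed_terms
        if term != text
        and term.startswith(prefix)  # the prefix_length constraint
        and edit_distance(text, term) <= max_edits
    ]


terms = ["organic"]
print(toy_suggest("rganic", terms))                   # []
print(toy_suggest("rganic", terms, prefix_length=0))  # ['organic']
print(toy_suggest("orgnic", terms))                   # ['organic']
```

With the prefix requirement in place, 'organic' is discarded before its edit distance is even looked at, because it does not start with 'r'. Dropping the requirement means every term becomes a possible candidate, which is where the extra cost comes from.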