elasticsearch elasticsearch-dsl elasticsearch-query elasticsearch-mapping

How can I get auto-suggestions for synonym matches in Elasticsearch?

I'm using the code below, and it does not suggest "curd" when I type "cu".

It does match the document containing "yogurt" when I search for a full synonym, which is correct. How can I get both auto-complete on synonym words and the document match from the same setup?

PUT products
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "synonym_graph"
            ]
          }
        },
        "filter": {
          "synonym_graph": {
            "type": "synonym_graph",
            "synonyms": [
              "yogurt, curd, dahi"
            ]
          }
        }
      }
    }
  }
}

PUT products/_mapping
{
  "properties": {
    "description": {
      "type": "text",
      "analyzer": "synonym_analyzer"
    }
  }
}

POST products/_doc
{
  "description": "yogurt"
}

GET products/_search
{
  "query": {
    "match": {
      "description": "cu"
    }
  }
}

Solution

When you provide a list of synonyms in a synonym_graph filter, it simply means that ES will treat those synonyms interchangeably. But since your analyzer is built on the standard tokenizer, only full-word tokens will be produced:

POST products/_analyze?filter_path=tokens.token
{
  "text": "yogurt",
  "field": "description"
}

yielding:

{
  "tokens" : [
    {
      "token" : "curd"
    },
    {
      "token" : "dahi"
    },
    {
      "token" : "yogurt"
    }
  ]
}

As such, a regular match query won't cut it here: the input "cu" is analyzed into the single token cu, and the standard tokenizer hasn't produced any matchable substrings (n-grams) of the indexed words for it to hit.
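
You can check this by running the same _analyze call on the partial input. With the analyzer defined in the question it should return the single token cu, which equals none of the indexed tokens (curd, dahi, yogurt):

POST products/_analyze?filter_path=tokens.token
{
  "text": "cu",
  "field": "description"
}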

Instead, you can replace match with match_phrase_prefix, which does what you're after: it treats the last term of the query as a prefix and expands it against the indexed terms, synonyms included, so cu matches curd:

GET products/_search
{
  "query": {
    "match_phrase_prefix": {
      "description": "cu"
    }
  }
}

But that, as the query name suggests, is only going to work for prefixes. If you fancy an autocomplete that suggests terms regardless of where the substring matches occur, have a look at my other answer where I talk about leveraging n-grams.
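
For the substring case, here is a rough sketch of the n-gram idea, assuming a separate index. The names products_ngram, synonym_ngram_analyzer, plain_search_analyzer, synonym_filter and ngram_filter are made up for illustration, and the regular synonym filter is swapped in for synonym_graph because the graph variant is intended for search-time analyzers. Terms are split into 2- and 3-character grams at index time, while the query text is only lowercased:

# Hypothetical index and analyzer names; a sketch, not a tuned setup
PUT products_ngram
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym_ngram_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "synonym_filter",
              "ngram_filter"
            ]
          },
          "plain_search_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase"
            ]
          }
        },
        "filter": {
          "synonym_filter": {
            "type": "synonym",
            "synonyms": [
              "yogurt, curd, dahi"
            ]
          },
          "ngram_filter": {
            "type": "ngram",
            "min_gram": 2,
            "max_gram": 3
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "synonym_ngram_analyzer",
        "search_analyzer": "plain_search_analyzer"
      }
    }
  }
}

POST products_ngram/_doc
{
  "description": "yogurt"
}

# "rd" is not a prefix of any synonym, but it is a gram of "curd"
GET products_ngram/_search
{
  "query": {
    "match": {
      "description": "rd"
    }
  }
}

With this mapping, a match query for cu or rd should hit the yogurt document via the grams of curd. Inputs longer than max_gram won't match as-is, so the gram sizes need tuning to the expected input length, and widening the range beyond a difference of 1 also requires raising index.max_ngram_diff.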