ElasticSearch: [multi_match] query does not support [search_analyzer]


In Elasticsearch 7.x, I've indexed my text fields with an analyzer that includes a synonym filter. I want documents that match the query terms exactly to score higher than documents that match only through synonyms, and I was planning to use search_analyzer for this.

To this end, for the query that should match exactly, I want to supply an analyzer that has no synonym filter, which is what search_analyzer is for. However, my main query is a multi_match query that searches these terms across all the relevant fields, each with a different boost.

It seems that Elasticsearch does not allow search_analyzer in a multi_match query. What are the alternatives, either for my high-level goal (boosting exact words over their synonyms) or for applying a search analyzer when I search several fields with different boosts?

PS: I do not want to index each field twice, once with a synonym analyzer and once without.


Solution

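  • search_analyzer is a mapping parameter: you define it on the field when you create the index, and Elasticsearch applies it at query time. So if you want to set it for the fields indexed with the synonym analyzer: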

    {
        "settings": {
            "index" : {
                "analysis" : {
                    "analyzer" : {
                        "synonym" : {
                            "tokenizer" : "whitespace",
                            "filter" : ["synonym"]
                        }
                    },
                    "filter" : {
                        "synonym" : {
                            "type" : "synonym",
                            "synonyms_path" : "analysis/synonym.txt"
                        }
                    }
                }
            }
        },
        "mappings" : {
            "properties" : {
                "description" : {
                    "type" : "text",
                    "analyzer" : "synonym",
                    "search_analyzer" : "standard"
                },
                "content" : {
                    "type" : "text",
                    "analyzer" : "synonym",
                    "search_analyzer" : "standard",
                    "fields" : {
                        "keyword" : {
                            "type" : "keyword",
                            "ignore_above" : 256
                        }
                    }
                }
            }
        }
    }
    

    This sets the default analyzer used at query time for those fields, so you can now run a multi_match query like this:

    {
      "query": {
        "multi_match" : {
          "query":      "bread cereal",
          "type":       "cross_fields",
          "fields": [
            "description",
            "content"
          ],
          "operator":   "and" 
        }
      }
    }
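
    Since the question also asks about giving fields different importance, here is a minimal sketch of the same query with a per-field boost, using the caret notation in the fields list. The boost value 3 is only an illustrative assumption, not something taken from the question:

    ```json
    {
      "query": {
        "multi_match" : {
          "query":    "bread cereal",
          "type":     "cross_fields",
          "fields":   [ "description^3", "content" ],
          "operator": "and"
        }
      }
    }
    ```

    Here a match in description should count roughly three times as much as a match in content; the right ratio depends on your data.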
    

    If you haven't set a search_analyzer in the mapping for those fields, the analyzer used for indexing is also used at query time. In that case, you can force a specific analyzer for a single query by adding the analyzer parameter to the query:

    {
      "query": {
        "multi_match" : {
          "query":      "bread cereal",
          "analyzer" : "standard",
          "type":       "cross_fields",
          "fields": [
            "description",
            "content"
          ],
          "operator":   "and" 
        }
      }
    }
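
    For the original goal of ranking exact matches above synonym matches, one possible sketch is a bool query with two multi_match clauses: one that overrides the analyzer with a synonym-free one and carries a higher boost, and one that uses the fields' default search analyzer. This only separates the two cases if synonyms are expanded at search time, that is, if the fields' search analyzer is the one with the synonym filter and the indexed tokens contain no synonyms. With the mapping above (synonyms added at index time, standard at search time), both clauses would match the same documents. The boost values are illustrative assumptions:

    ```json
    {
      "query": {
        "bool": {
          "should": [
            {
              "multi_match": {
                "query":    "bread cereal",
                "analyzer": "standard",
                "fields":   [ "description^3", "content" ],
                "boost":    2
              }
            },
            {
              "multi_match": {
                "query":  "bread cereal",
                "fields": [ "description^3", "content" ]
              }
            }
          ]
        }
      }
    }
    ```

    Under that assumption, a document containing the exact terms matches both clauses and their scores add up, while a document that matches only through a synonym is scored by the second clause alone.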