magento elasticsearch search magento2

Searching for T-shirt not returning t-shirt in Elasticsearch


I am using the following settings and mappings for Elasticsearch:

{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10
        },
        "synonym_filter": {
          "type": "synonym",
          "synonyms":[
            "yoga,fit-sports,blue",
            "tshirt,tees,t-shirt"
          ]
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "synonym_filter",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "products": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "analyzer": "autocomplete",
          "search_analyzer": "standard"
        }
      }
    }
  }
}

I indexed a document with the field name set to "Princess Print T-shirt".

Because I am using the whitespace tokenizer, Elasticsearch creates a token like "t-shirt". For searching, however, I am using "search_analyzer": "standard", so I think the query becomes "princess print t shirt". The "t shirt" part does not match, and the search returns no results.

One solution I can think of is adding a synonym such as "t shirt,t-shirt". That gives me the result, but then a search for "shirt" returns both "t-shirt" and "shirt", which is not acceptable. And if I don't use "search_analyzer": "standard", I don't get the expected results. If I search for "t-shirt", I need only that result.


Solution

  • Problem description

    The problematic part is, as you already described, "search_analyzer": "standard".

    At query time, the standard analyzer turns every occurrence of T-shirt into the two tokens t and shirt. The terms in your index are edge n-grams such as t-shirt, t-shir and so on, so the query token shirt does not match any of them.
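
    One way to confirm this is the _analyze API. The request body below is a minimal sketch, assuming it is sent as POST /_analyze against your cluster:

    ```json
    {
      "analyzer": "standard",
      "text": "T-shirt"
    }
    ```

    The response should list the two tokens t and shirt. Sending the same text to the index's own _analyze endpoint with "analyzer": "autocomplete" shows the edge n-grams that were produced at index time.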

    Possible solution

    Adapt search analyzer

    You need to make sure that the query is lowercased and split on whitespace only, so that t-shirt stays a single token. You could define a custom analyzer for query time that tokenizes the same way as the whitespace analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-whitespace-analyzer.html) and adds the lowercase token filter.
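
    A possible sketch of such a query-time analyzer follows. The name autocomplete_search is a placeholder chosen for this example; the fragment shows only the parts that change and assumes it is merged into the settings and mappings from the question, where the autocomplete analyzer is already defined:

    ```json
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "autocomplete_search": {
              "type": "custom",
              "tokenizer": "whitespace",
              "filter": ["lowercase"]
            }
          }
        }
      },
      "mappings": {
        "products": {
          "properties": {
            "name": {
              "type": "text",
              "analyzer": "autocomplete",
              "search_analyzer": "autocomplete_search"
            }
          }
        }
      }
    }
    ```

    With this in place, a query for T-shirt should be analyzed to the single token t-shirt, which equals one of the indexed edge n-grams. A query for shirt stays shirt and should not match a document that only contains T-shirt, because none of that document's indexed terms is shirt.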