I am using the following settings and mappings for Elasticsearch:
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10
        },
        "synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "yoga,fit-sports,blue",
            "tshirt,tees,t-shirt"
          ]
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "synonym_filter",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "products": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          },
          "analyzer": "autocomplete",
          "search_analyzer": "standard"
        }
      }
    }
  }
}
And I indexed a document with the field "name": "Princess Print T-shirt".

Since I am using the whitespace tokenizer, Elasticsearch creates tokens like "t-shirt". But for searching I am using "search_analyzer": "standard", so I think the query is analyzed into "princess print t shirt", and this "t shirt" will not match, which gives an empty search result. One solution from my side is adding a synonym like "t shirt,t-shirt"; then I get the result. But in that case, if I search for "shirt", it returns both "t-shirt" and "shirt", which is not acceptable. And if I don't use "search_analyzer": "standard", I also don't get the expected result. If I search for "t-shirt", I need only that search result.
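The query-time tokenization described above can be verified with the built-in _analyze API (a request sketch; the analyzer named here is the built-in standard analyzer, run against any cluster):

```json
POST _analyze
{
  "analyzer": "standard",
  "text": "Princess Print T-shirt"
}
```

The response lists the tokens princess, print, t, and shirt, confirming that the standard analyzer splits "T-shirt" at the hyphen.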
The problematic part is, as you already described, "search_analyzer": "standard". This will transform every query for T-shirt into the tokens t and shirt. The data in your index looks like t-shirt, t-shir and so on, and does not match.

You need to make sure that the query is lowercased and split at whitespace. So you could also define a custom analyzer for query time, using the whitespace tokenizer (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-whitespace-analyzer.html) combined with a lowercase filter.
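Such a query-time analyzer could look like this (a sketch, not the only option; the analyzer name whitespace_lowercase is my own choice and not part of your original settings):

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_lowercase": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings": {
    "products": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "autocomplete",
          "search_analyzer": "whitespace_lowercase"
        }
      }
    }
  }
}
```

With this, a query for "T-shirt" is analyzed into the single token t-shirt, which matches the whitespace-tokenized index data without adding a "t shirt,t-shirt" synonym, so a search for "shirt" no longer pulls in "t-shirt".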