I am using Elasticsearch 7.0.0. I am trying to work with synonyms, using this configuration when creating the index:
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"synonym": {
"tokenizer": "whitespace",
"filter": [
"synonym"
]
}
},
"filter": {
"synonym": {
"type": "synonym",
"synonyms_path": "synonyms.txt"
}
}
}
}
},
"mappings": {
"properties": {
"address.state": {
"type": "text",
"analyzer": "synonym"
},
"location": {
"type": "geo_point"
}
}
}
}
Here's a document inserted into the index:
{
"name": "Berry's Burritos",
"description": "Best burritos in New York",
"address": {
"street": "230 W 4th St",
"city": "New York",
"state": "NY",
"zip": "10014"
},
"location": [
40.7543385,
-73.976313
],
"tags": [
"mexican",
"tacos",
"burritos"
],
"rating": "4.3"
}
The content of synonyms.txt is:
ny, new york, big apple
When I search for anything in the address.state property, I get an empty result.
Here's the query:
{
"query": {
"bool": {
"filter": {
"range": {
"rating": {
"gte": 4
}
}
},
"must": {
"match": {
"address.state": "ny"
}
}
}
}
}
Even with ny (as is, no synonym) in the query, the result is empty.
Before, when I created the index without mappings, the query returned results, just not for synonyms.
But now, with mappings, the result is empty even though the term is present.
This query works, though:
{
"query": {
"query_string": {
"query": "tacos",
"fields": [
"tags"
]
}
}
}
I have read many articles and tutorials and got this far.
What am I missing here?
While indexing, you are passing the value as "state": "NY". Notice the case of NY. The synonym analyzer defined in the settings has only one filter, synonym. Because of the case difference, NY doesn't match any of the synonym sets defined in synonyms.txt: NY is not equal to ny. To make the matching case-insensitive, add the lowercase filter before the synonym filter in the synonym analyzer. This ensures that any input text is lowercased first and the synonym filter is applied afterwards. The same happens at search time when you query that field with full-text search queries.
So your settings will be as below:
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"synonym": {
"tokenizer": "whitespace",
"filter": [
"lowercase",
"synonym"
]
}
},
"filter": {
"synonym": {
"type": "synonym",
"synonyms_path": "synonyms.txt"
}
}
}
}
}
}
No changes are required in the mapping.
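One way to check what the analyzer actually emits is the _analyze API. As a sketch, assuming the index is reachable as <index> (a placeholder, since the index name is not given in the question), a body like the following can be sent to POST /<index>/_analyze:

```json
{
  "analyzer": "synonym",
  "text": "NY"
}
```

With the original settings, the only token returned should be NY, unchanged. With the lowercase filter in place, the response should contain ny together with the tokens expanded from the synonyms.txt rule.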
As for why the query matched before you added mappings: when no mapping is defined, Elasticsearch dynamically maps address.state as a text field with no explicit analyzer. In that case Elasticsearch uses the standard analyzer by default, which includes the lowercase token filter. NY was therefore indexed as ny, and the query matched the document.
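The default behaviour can be checked the same way. A minimal sketch: the built-in standard analyzer needs no index, so this body can be sent to POST /_analyze:

```json
{
  "analyzer": "standard",
  "text": "NY"
}
```

The response should contain a single token, ny, which is why a match query for ny found the document when no custom analyzer was configured.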