I am currently indexing the element field as "element" : "dog,cat,mouse" with the following configuration:
ES config:
"settings": {
"analysis": {
"analyzer": {
"search_synonyms": {
"tokenizer": "whitespace",
"filter": [
"graph_synonyms",
"lowercase",
"asciifolding"
]
},
"comma" : {
"type" : "custom",
"tokenizer" : "comma"
}
},
"filter": {
"graph_synonyms": {
...
}
},
"normalizer": {
"normalizer_1": {
...
}
},
"tokenizer" : {
"comma" : {
"type" : "pattern",
"pattern" : ","
}
}
}
},
Fields mapping:
"mappings": {
"properties": {
"element": {
"type": "keyword",
"normalizer": "normalizer_1"
}
................
}
}
The value dog,cat,mouse is used afterwards as a filter, but I want to split it and use each value as a separate filter. I tried the multi-fields feature and made the following changes, but I'm still not sure what else I should do.
"element": {
"type": "keyword",
"normalizer": "normalizer_1",
"fields": {
"separated": {
"type": "text",
"analyzer": "comma"
}
}
},
If I understand correctly, you have a field that stores the value dog,cat,mouse and you need the values separately, as dog, cat and mouse. For that you can simply store them in a text field, which uses the standard analyzer by default; that analyzer splits this value into tokens at the commas.
Analyze API request (POST /_analyze) to show the tokens
{
"text": "dog,cat,mouse",
"analyzer": "standard"
}
tokens generated
{
"tokens": [
{
"token": "dog",
"start_offset": 0,
"end_offset": 3,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "cat",
"start_offset": 4,
"end_offset": 7,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "mouse",
"start_offset": 8,
"end_offset": 13,
"type": "<ALPHANUM>",
"position": 2
}
]
}
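For comparison, the comma tokenizer from the question can be checked the same way without creating an index, because the analyze API accepts an inline tokenizer definition. This is only a sketch: the definition below copies the pattern tokenizer from the question's settings, and the text is the sample value.

```console
# Inline copy of the question's "comma" tokenizer (type pattern, pattern ",")
POST /_analyze
{
  "tokenizer": {
    "type": "pattern",
    "pattern": ","
  },
  "text": "dog,cat,mouse"
}
```

This should also yield the tokens dog, cat and mouse. One difference to keep in mind: the standard analyzer lowercases its tokens, while a pattern tokenizer on its own leaves the case untouched unless a lowercase filter is added.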
As per the comment, adding a sample of how to define the element field so that the standard analyzer is used. Note that it is currently defined as keyword with a normalizer, hence the standard analyzer is not used.
Index mapping
PUT /your-index/
{
"mappings": {
"properties": {
"element": {
"type": "text"
}
}
}
}
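Once element is a text field, each value can be used as its own filter. A minimal sketch, assuming the index name your-index from the sample, that element is mapped as text in a newly created index (an existing keyword field cannot be changed to text in place), and that the documents have been indexed into it:

```console
# Assumes "element" is mapped as text with the default standard analyzer.
# A term query is not analyzed, so the value must match the indexed token
# exactly, i.e. lowercase.
GET /your-index/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "element": "dog" } }
      ]
    }
  }
}
```

To require several values at once, add one term clause per value to the filter array; to match any of several values, a terms query with a list such as ["dog", "cat"] can be used instead.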