I have hundreds of chemicals results in my index climate_change
I'm using a ngram research and this is the settings that I'm using for the index.
{
"settings": {
"index.max_ngram_diff": 30,
"index": {
"analysis": {
"analyzer": {
"analyzer": {
"tokenizer": "test_ngram",
"filter": [
"lowercase"
]
},
"search_analyzer": {
"tokenizer": "test_ngram",
"filter": [
"lowercase"
]
}
},
"tokenizer": {
"test_ngram": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 30,
"token_chars": [
"letter",
"digit"
]
}
}
}
}
}
}
My main problem is that if I try to do a query like this one
GET climate_change/_search?size=1000
{
"query": {
"match": {
"description": {
"query":"oxygen"
}
}
}
}
I see that a lot of results have the same score 7.381186..but it's strange
{
"_index" : "climate_change",
"_type" : "_doc",
"_id" : "XXX",
"_score" : 7.381186,
"_source" : {
"recordtype" : "chemicals",
"description" : "carbon/oxygen"
}
},
{
"_index" : "climate_change",
"_type" : "_doc",
"_id" : "YYY",
"_score" : 7.381186,
"_source" : {
"recordtype" : "chemicals",
"description" : "oxygen"
}
How could it be possible? In the example above, If I'm using ngram and I'm searching oxygen in the description field, I'll expect that the second result will have a score bigger than the first one. I've also tried to specify the type of the tokenizer "standard" and "whitespace" in the settings, but it could not help. Maybe is the '/' character inside the description?
Thanks a lot!
You need to define the analyzer in the mapping for the description
field also.
Adding a working example with index data, mapping, search query and search result
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "test_ngram",
"filter": [
"lowercase"
]
},
"search_analyzer": {
"tokenizer": "test_ngram",
"filter": [
"lowercase"
]
}
},
"tokenizer": {
"test_ngram": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 30,
"token_chars": [
"letter",
"digit"
]
}
}
}
},
"mappings": {
"properties": {
"description": {
"type": "text",
"analyzer": "my_analyzer"
}
}
}
}
Index Data:
{
"recordtype": "chemicals",
"description": "carbon/oxygen"
}
{
"recordtype": "chemicals",
"description": "oxygen"
}
Search Query:
{
"query": {
"match": {
"description": {
"query":"oxygen"
}
}
}
}
Search Result:
"hits": [
{
"_index": "67180160",
"_type": "_doc",
"_id": "2",
"_score": 0.89246297,
"_source": {
"recordtype": "chemicals",
"description": "oxygen"
}
},
{
"_index": "67180160",
"_type": "_doc",
"_id": "1",
"_score": 0.6651374,
"_source": {
"recordtype": "chemicals",
"description": "carbon/oxygen"
}
}
]