We have a type that contains a mac_address field. The data is loaded using the JDBC river.
The problem is that when we run a terms aggregation on the mac_address field, the results look as if the field has been broken into separately indexed keys:
Action:
GET index/type/_search?search_type=count
{
  "aggs" : {
    "uniqe_macs" : {
      "terms" : {
        "field" : "mac_address"
      }
    }
  }
}
Result:
"aggregations": {
  "uniqe_macs": {
    "buckets": [
      {
        "key": "00",
        "doc_count": 1608759
      },
      {
        "key": "10",
        "doc_count": 674633
      },
      {
        "key": "18",
        "doc_count": 588591
      },
      {
        "key": "f0",
        "doc_count": 544897
      },
      {
        "key": "60",
        "doc_count": 538841
      },
      {
        "key": "40",
        "doc_count": 529085
      },
      {
        "key": "08",
        "doc_count": 523681
      },
      {
        "key": "d0",
        "doc_count": 515774
      },
      {
        "key": "54",
        "doc_count": 514771
      },
      {
        "key": "04",
        "doc_count": 509629
      }
    ]
  }
}
What can be done to make Elasticsearch index this field as a single value instead of breaking it into separate keys?
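The two-character bucket keys are consistent with the field being analyzed into separate tokens at index time. Below is a rough Python sketch of that effect, not Elasticsearch code. It assumes the analyzer lowercases the value and splits it on the colon separators, which is a simplification of the real tokenizer; the sample addresses and the helper names `analyze` and `terms_buckets` are made up for illustration.

```python
import re
from collections import Counter

# Made-up sample documents; these addresses are illustrative only.
docs = [
    "00:1A:2B:3C:4D:5E",
    "00:1A:2B:77:88:99",
    "F0:DE:F1:00:11:22",
]

def analyze(value):
    """Rough stand-in for an analyzed string field:
    lowercase, then split on any non-alphanumeric character."""
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]

def terms_buckets(values):
    """Count, per token, how many documents contain it
    (similar in spirit to doc_count in a terms aggregation)."""
    counts = Counter()
    for v in values:
        counts.update(set(analyze(v)))  # set(): count each doc once per token
    return counts.most_common()

buckets = terms_buckets(docs)
print(buckets[:3])
```

Under these assumptions the buckets are keyed by individual octets such as "00" rather than by whole addresses, which is the same shape as the result above.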
Can you try the following mapping, which uses a custom analyzer on the mac_address field?
Define the analyzer:
curl -XPUT http://localhost:9200/INDEX -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "my_edge_ngram_analyzer" : {
          "tokenizer" : "my_edge_ngram_tokenizer"
        }
      },
      "tokenizer" : {
        "my_edge_ngram_tokenizer" : {
          "type" : "edgeNGram",
          "min_gram" : "2",
          "max_gram" : "17"
        }
      }
    }
  }
}'
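To get a feel for what this tokenizer would emit, here is a small Python sketch of edge n-gram generation, not Elasticsearch code. It assumes the tokenizer treats the whole value as one piece of text and does not split or lowercase it; the function name `edge_ngrams` and the sample address are made up.

```python
def edge_ngrams(text, min_gram=2, max_gram=17):
    """Return the leading substrings of text with lengths min_gram..max_gram."""
    upper = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, upper + 1)]

mac = "00:1a:2b:3c:4d:5e"  # made-up address, 17 characters long
tokens = edge_ngrams(mac)

print(len(tokens))            # number of prefixes produced
print(tokens[0], tokens[-1])  # shortest and longest prefix
```

A colon-separated MAC address is 17 characters long, so with max_gram set to 17 the longest prefix in this sketch is the full address, indexed alongside its shorter prefixes.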
Apply the mapping:
curl -XPUT http://localhost:9200/INDEX/TYPE/_mapping -d '
{
  "TYPE": {
    "properties": {
      "mac_address": {
        "type": "string",
        "index_analyzer" : "my_edge_ngram_analyzer",
        "search_analyzer": "keyword"
      }
    }
  }
}'
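As a rough illustration of how the two analyzers in this mapping would interact, here is a Python sketch, not Elasticsearch code. It assumes the index-time analyzer stores the prefixes described by min_gram and max_gram, and that the keyword search analyzer passes the query string through as a single unchanged term; `index_terms`, `would_match` and the sample address are made up.

```python
def index_terms(value, min_gram=2, max_gram=17):
    """Prefixes an edge n-gram analyzer would store for value (assumed behaviour)."""
    upper = min(max_gram, len(value))
    return {value[:n] for n in range(min_gram, upper + 1)}

def would_match(query, stored_value):
    """Keyword search analyzer (assumed): the query is one unchanged term,
    so it matches only if it equals one of the stored prefixes."""
    return query in index_terms(stored_value)

stored = "00:1a:2b:3c:4d:5e"  # made-up address

print(would_match("00:1a", stored))  # a leading prefix of the address
print(would_match(stored, stored))   # the full address
print(would_match("1a:2b", stored))  # a fragment from the middle
```

In this sketch a query matches when it is a leading prefix of the stored address or the whole address, and fragments from the middle do not match. Nothing here lowercases the input, so the query and the stored value would need to use the same case.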