I couldn't find an answer in previous posts, so I hope this question is relevant. I am having trouble with Elasticsearch term facets.
When I query the document count for every term in a terms facet, I get, say, 8 for some field value, but when I query the count of documents with that specific value for the field, I get 19.
To be more precise, I am using Kibana, and here are the queries and responses (FYI, I was told to rename the field values):
all term facets count query:
{
"facets" : {
"terms" : {
"terms" : {
"fields" : ["field.name"],
"size" : 6,
"order" : "count",
"exclude" : []
},
"facet_filter" : {
"fquery" : {
"query" : {
"filtered" : {
"query" : {
"bool" : {
"should" : [{
"query_string" : {
"query" : "*"
}
}
]
}
},
"filter" : {
"bool" : {
"must" : [{
"match_all" : {}
}
]
}
}
}
}
}
}
}
},
"size" : 0
}
the response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 20374,
"max_score" : 0.0,
"hits" : []
},
"facets" : {
"terms" : {
"_type" : "terms",
"missing" : 10567,
"total" : 9918,
"other" : 9781,
"terms" : [{
"term" : "fieldValue1",
"count" : 43
}, {
"term" : "fieldValue2",
"count" : 27
}, {
"term" : "fieldValue3",
"count" : 23
}, {
"term" : "fieldValue4",
"count" : 23
}, {
"term" : "fieldValue5",
"count" : 13
}, {
"term" : "fieldValue6",
"count" : 8
}
]
}
}
}
the query on "fieldValue6"
{
"facets" : {
"terms" : {
"terms" : {
"fields" : ["field.name"],
"size" : 6,
"order" : "count",
"exclude" : []
},
"facet_filter" : {
"fquery" : {
"query" : {
"filtered" : {
"query" : {
"bool" : {
"should" : [{
"query_string" : {
"query" : "*"
}
}
]
}
},
"filter" : {
"bool" : {
"must" : [{
"terms" : {
"field.name" : ["fieldValue6"]
}
}
]
}
}
}
}
}
}
}
},
"size" : 0
}
the response :
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 20374,
"max_score" : 0.0,
"hits" : []
},
"facets" : {
"terms" : {
"_type" : "terms",
"missing" : 0,
"total" : 19,
"other" : 0,
"terms" : [{
"term" : "fieldValue6",
"count" : 19
}
]
}
}
}
The field I apply the facet filter to (or whatever it is actually supposed to be called) is set as "not analyzed":
properties: {
type_ref2Strack: {
properties: {
position: {
type: long
}
name: {
index: not_analyzed
norms: {
enabled: false
}
index_options: docs
type: string
}
}
}
}
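This is a long-standing known limitation of Elasticsearch facets (now superseded by aggregations).
The key problem is that the facet runs on each shard separately with the given "size", and the per-shard results are then combined. A term that falls outside the top "size" terms on some shard contributes nothing from that shard, so its merged count can come out too low. That is why fieldValue6 shows 8 in the first query but 19 once the filter restricts the facet to that single term.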
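To make the truncation concrete, here is a minimal Python simulation of the merge step. It is only a sketch of the behaviour described above, not Elasticsearch code: the shard contents, the term names and the helper functions (shard_top_terms, merged_facet, exact_count) are all made up for illustration.

```python
from collections import Counter

def shard_top_terms(shard_docs, size):
    """Top `size` terms on a single shard, as a per-shard facet would return them."""
    return Counter(shard_docs).most_common(size)

def merged_facet(shards, size):
    """Sum the per-shard top lists, then keep the overall top `size` terms."""
    merged = Counter()
    for docs in shards:
        for term, count in shard_top_terms(docs, size):
            merged[term] += count
    return merged.most_common(size)

def exact_count(shards, term):
    """True number of documents carrying `term` across all shards."""
    return sum(docs.count(term) for docs in shards)

# Hypothetical index with three shards; each list holds one field value per document.
shards = [
    ["a"] * 5 + ["b"] * 4 + ["x"] * 3,  # "x" is only third on this shard
    ["a"] * 5 + ["c"] * 4 + ["x"] * 3,  # same here
    ["x"] * 6 + ["a"] * 1,              # "x" is first on this shard
]

print(merged_facet(shards, 2))   # "x" is reported with 6
print(exact_count(shards, "x"))  # but 12 documents actually contain "x"
```

With size 2, the first two shards never report "x", so only the 6 documents from the third shard are counted. In this toy data, asking each shard for 3 terms instead of 2 is enough to recover the exact count of 12, though in general a larger per-shard size only reduces the error.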
There are two non-ideal ways to handle this:
For more info see here: