Let's assume we're looking at data that's reasonably simple -- each document in our index has this structure:
{
"Time": "2018-01-01T19:35:00.0000000Z",
"Country": "Germany",
"Addr": "security.web.com",
"FailureCount": 5,
"SuccessCount": 50
}
My question essentially boils down to how I might go about doing something like this: https://www.elastic.co/guide/en/elasticsearch/guide/current/_combining_the_two.html. Specifically, I am trying to perform the same aggregation on all combinations of Country and Addr. My current query attempt is below. I aggregate across a 5-minute grain (that is part of my requirements), and so far I have only been able to aggregate based on one query.
{
"size":0,
"query":{
"bool":{
"filter":[
{
"range":{
"Time":{
"gte":"1514835300000",
"lte":"1514835600000",
"format":"epoch_millis"
}
}
},
{
"query_string":{
"analyze_wildcard":true,
"query":"Country:Germany"
}
}
]
}
},
"aggs":{
"2":{
"date_histogram":{
"interval":"5m",
"field":"Time",
"min_doc_count":0,
"extended_bounds":{
"min":"1514835300000",
"max":"1514835600000"
},
"format":"epoch_millis"
},
"aggs":{
"4":{
"bucket_script":{
"buckets_path":{
"success":"9",
"failure":"10"
},
"script":"( params.success + params.failure )"
}
},
"9":{
"sum":{
"field":"SuccessCount"
}
},
"10":{
"sum":{
"field":"FailureCount"
}
}
}
}
}
}
This works, but simply aggregates on all documents that match the bool-filter (over 5-minute buckets). Instead, I want to aggregate across all combinations of Country and Addr (over 5-minute buckets).
That is, I would like an aggregation result/metric (as laid out in the script in bucket 4) for all docs that have "Country": "Germany" and "Addr": "security.web.com", one for all docs that have "Country": "United States" and "Addr": "security.web.com", and so on, for all Addr values and all Country values. Is this possible in one Elasticsearch request? What might my best option be here?
Follow-up
Is it also possible to do this not across all combinations of Addr and Country values, but only across specific combinations (that I might lay out in a query)? Or am I overreaching beyond ES's capabilities within one request?
Thanks!
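If you want this in one query, you can nest the aggregations: a terms aggregation on Country, a terms aggregation on Addr inside it, and your original date_histogram (with its metrics) inside that.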
"aggs": {
"countries": {
"terms": {
"field": "Country",
"size": 300
},
"aggs": {
"addrs": {
"terms": {
"field": "Addr",
"size": 1000
},
"aggs": {
"2": {
"date_histogram":.....// your original query
}
}
}
}
}
}
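To make the nesting concrete, here is a rough sketch in Python that builds the whole request body as a plain dict and flattens the nested buckets of a response into rows. The function names are made up for illustration, and it assumes Country and Addr are mapped as keyword fields (with the default dynamic mapping you would likely need Country.keyword and Addr.keyword instead). The Country query_string filter from the question is left out, since the terms aggregation now covers every country.

```python
import json


def build_combination_query(start_ms, end_ms, interval="5m"):
    """Build a request body nested as Country > Addr > date_histogram > sums."""
    # Innermost level: the date_histogram and metrics from the question.
    histogram = {
        "date_histogram": {
            "field": "Time",
            "interval": interval,
            "min_doc_count": 0,
            "extended_bounds": {"min": str(start_ms), "max": str(end_ms)},
            "format": "epoch_millis",
        },
        "aggs": {
            "9": {"sum": {"field": "SuccessCount"}},
            "10": {"sum": {"field": "FailureCount"}},
            "4": {
                "bucket_script": {
                    "buckets_path": {"success": "9", "failure": "10"},
                    "script": "( params.success + params.failure )",
                }
            },
        },
    }
    time_range = {
        "range": {
            "Time": {
                "gte": str(start_ms),
                "lte": str(end_ms),
                "format": "epoch_millis",
            }
        }
    }
    return {
        "size": 0,
        "query": {"bool": {"filter": [time_range]}},
        "aggs": {
            "countries": {
                # Assumes a keyword mapping; sizes are taken from the answer.
                "terms": {"field": "Country", "size": 300},
                "aggs": {
                    "addrs": {
                        "terms": {"field": "Addr", "size": 1000},
                        "aggs": {"2": histogram},
                    }
                },
            }
        },
    }


def flatten_buckets(response):
    """Turn the nested buckets of a search response into flat rows."""
    rows = []
    for country in response["aggregations"]["countries"]["buckets"]:
        for addr in country["addrs"]["buckets"]:
            for slot in addr["2"]["buckets"]:
                # The bucket_script result may be missing for a skipped bucket.
                total = slot.get("4", {}).get("value")
                rows.append((country["key"], addr["key"], slot["key"], total))
    return rows


if __name__ == "__main__":
    body = build_combination_query(1514835300000, 1514835600000)
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a normal _search request; flatten_buckets then gives one (country, addr, bucket start, total) row per combination and 5-minute bucket.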
However, I would not recommend doing this on a large amount of data, as such deeply nested aggregations can be very slow. If you really need to do this in a single query, combine Country and Addr into a single field at index time and aggregate on that field.
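As a sketch of the combined-field idea: the field name CountryAddr and the "|" separator below are hypothetical choices, not anything Elasticsearch prescribes, and the helper names are made up too. The document is enriched before indexing, and a single terms level on the new field replaces the two nested ones.

```python
SEPARATOR = "|"  # hypothetical; pick a character that never occurs in either value


def add_combined_field(doc):
    """Return a copy of doc with a hypothetical CountryAddr field added."""
    enriched = dict(doc)
    enriched["CountryAddr"] = f"{doc['Country']}{SEPARATOR}{doc['Addr']}"
    return enriched


def combined_field_aggs(histogram_agg, size=10000):
    """One terms level on the combined field, with the histogram nested under it."""
    return {
        "combos": {
            # Assumes CountryAddr is mapped as a keyword field.
            "terms": {"field": "CountryAddr", "size": size},
            "aggs": {"2": histogram_agg},
        }
    }


def split_key(key):
    """Split a combined bucket key back into (country, addr)."""
    country, addr = key.split(SEPARATOR, 1)
    return country, addr


if __name__ == "__main__":
    doc = {
        "Time": "2018-01-01T19:35:00.0000000Z",
        "Country": "Germany",
        "Addr": "security.web.com",
        "FailureCount": 5,
        "SuccessCount": 50,
    }
    print(add_combined_field(doc)["CountryAddr"])
```

Here histogram_agg stands for the "2" aggregation from the question (the date_histogram plus its sub-aggregations). Existing documents would need to be reindexed to get the new field.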
If you want specific combinations, put them inside a filters aggregation, one filter per combination, and nest your original aggregation under it.
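For the specific-combinations case, something along these lines should work. It builds a filters aggregation with one named filter per (Country, Addr) pair; the helper name, the "combos" aggregation name, and the "Country/Addr" bucket naming are arbitrary choices for this sketch, and the term queries again assume keyword mappings.

```python
import json


def build_filters_agg(pairs, histogram_agg):
    """Build a filters aggregation with one named bucket per (country, addr) pair."""
    filters = {}
    for country, addr in pairs:
        name = f"{country}/{addr}"  # bucket name, chosen only for readability
        filters[name] = {
            "bool": {
                "filter": [
                    {"term": {"Country": country}},
                    {"term": {"Addr": addr}},
                ]
            }
        }
    return {
        "combos": {
            "filters": {"filters": filters},
            # Nest the original aggregation under every named bucket.
            "aggs": {"2": histogram_agg},
        }
    }


if __name__ == "__main__":
    pairs = [
        ("Germany", "security.web.com"),
        ("United States", "security.web.com"),
    ]
    # Placeholder for the "2" aggregation from the question.
    histogram_agg = {"date_histogram": {"field": "Time", "interval": "5m"}}
    print(json.dumps({"size": 0, "aggs": build_filters_agg(pairs, histogram_agg)}, indent=2))
```

With named filters, the response returns the buckets as an object keyed by those names, so each combination's histogram can be looked up directly.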