I'm trying to extract, for each event_type, the number of days on which the event occurred and its last occurrence. Currently I have written the following query:
{
  "size": 0,
  "query": { ... },
  "_source": false,
  "aggregations": {
    "events": {
      "terms": { "field": "event_type" },
      "aggs": {
        "event_timestamp": {
          "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "day",
            "min_doc_count": 1,
            "order": { "_key": "desc" }
          }
        }
      }
    }
  }
}
This extracts data that still has to be transformed. In my case, I just need, for each event_type, the key of the first event_timestamp bucket and the number of event_timestamp buckets. This can be done easily in a second step after the extraction, but it would be cleaner to let Elastic do it.
A max bucket (or metric) aggregation does not seem helpful, as I don't see a suitable buckets_path in the date_histogram aggregation.
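For reference, the client-side second step described above might look like the following minimal Python sketch. It assumes the response has the usual aggregation shape (`aggregations.events.buckets[].event_timestamp.buckets[]`) and that the histogram buckets arrive in descending key order, as the query requests. The function name `summarize_events` and the sample data are hypothetical.

```python
from typing import Any


def summarize_events(response: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Map each event_type to its latest day bucket and its number of day buckets."""
    summary: dict[str, dict[str, Any]] = {}
    for event in response["aggregations"]["events"]["buckets"]:
        days = event["event_timestamp"]["buckets"]
        # Buckets are ordered by _key desc, so the first one is the latest day.
        last_day = days[0].get("key_as_string", days[0]["key"]) if days else None
        summary[event["key"]] = {"last_day": last_day, "day_count": len(days)}
    return summary


# Hypothetical response fragment, for illustration only.
sample = {
    "aggregations": {
        "events": {
            "buckets": [
                {
                    "key": "login",
                    "doc_count": 5,
                    "event_timestamp": {
                        "buckets": [
                            {"key_as_string": "2023-01-05T00:00:00.000Z",
                             "key": 1672876800000, "doc_count": 2},
                            {"key_as_string": "2023-01-02T00:00:00.000Z",
                             "key": 1672617600000, "doc_count": 3},
                        ]
                    },
                }
            ]
        }
    }
}

print(summarize_events(sample))
```

This works, but it transfers every daily bucket over the wire just to read the first key and count the rest, which is the overhead the question is trying to avoid.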
Good start!!
Here is a query that returns the latest timestamp per event_type as well as the number of daily buckets for each event_type (for the latter you can use the bucket_script pipeline aggregation, which simply returns the number of buckets in the date_histogram aggregation via the special _bucket_count path).
Also note that the filter_path query string parameter allows you to return only what matters to you, i.e. you can get rid of the date histogram that you're not interested in:
POST test/_search?filter_path=**.latest_timestamp.hits.hits._source,**.bucket_count,**.events.buckets.key
{
  "size": 0,
  "_source": false,
  "query": {...},
  "aggregations": {
    "events": {
      "terms": {
        "field": "event_type"
      },
      "aggs": {
        "latest_timestamp": {
          "top_hits": {
            "size": 1,
            "sort": {"@timestamp": "desc"},
            "_source": ["@timestamp"]
          }
        },
        "event_timestamp": {
          "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "day",
            "min_doc_count": 1,
            "order": {
              "_key": "desc"
            }
          }
        },
        "bucket_count": {
          "bucket_script": {
            "buckets_path": {
              "count": "event_timestamp._bucket_count"
            },
            "script": "params.count"
          }
        }
      }
    }
  }
}
You'll get for each event type something that looks like this:
{
  "key": "2",                           <--- event type
  "latest_timestamp": {
    "hits": {
      "hits": [
        {
          "_source": {
            "@timestamp": "2023-01-05"  <--- latest timestamp
          }
        }
      ]
    }
  },
  "bucket_count": {
    "value": 4                          <--- bucket count
  }
},
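If you consume the result from code, flattening the filtered response is short. Below is a minimal Python sketch, assuming the sub-aggregation names used in the query above (`latest_timestamp` and `bucket_count`) and a response already parsed into a dict. `EventSummary`, `flatten_buckets` and the sample data are hypothetical names introduced for illustration.

```python
from typing import Any, NamedTuple, Optional


class EventSummary(NamedTuple):
    event_type: str
    latest_timestamp: Optional[str]
    bucket_count: int


def flatten_buckets(response: dict[str, Any]) -> list[EventSummary]:
    """Turn the filtered aggregation response into one row per event type."""
    rows = []
    for bucket in response["aggregations"]["events"]["buckets"]:
        hits = bucket.get("latest_timestamp", {}).get("hits", {}).get("hits", [])
        latest = hits[0]["_source"]["@timestamp"] if hits else None
        # The bucket_script value may come back as a float (e.g. 4.0), so cast it.
        count = int(bucket["bucket_count"]["value"])
        rows.append(EventSummary(bucket["key"], latest, count))
    return rows


# Hypothetical filtered response, mirroring the shape shown above.
sample_response = {
    "aggregations": {
        "events": {
            "buckets": [
                {
                    "key": "2",
                    "latest_timestamp": {
                        "hits": {"hits": [{"_source": {"@timestamp": "2023-01-05"}}]}
                    },
                    "bucket_count": {"value": 4.0},
                }
            ]
        }
    }
}

for row in flatten_buckets(sample_response):
    print(row.event_type, row.latest_timestamp, row.bucket_count)
```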