I want to get date_histogram during specific period, how to restrict the date period? how should I use the extended_bounds parameter? For example : I want to query the date_histogram between '2024-07-10' and '2024-07-22', and the fixed_interval could be changed. I query with this expression insuring having the same filter range in the query:
GET /logging-events/_search
{
"size": 1,
"query": {
"bool": {
"filter": [
{
"range": {
"timestamp": {
"gt": "2024-07-10T00:00:00",
"lt": "2024-07-22T23:59:59"
}
}
}
]
}
},
"aggregations": {
"date_ranges": {
"date_histogram": {
"field": "timestamp",
"fixed_interval": "8d",
"extended_bounds": {
"min": "2024-07-10",
"max": "2024-07-22"
},
"format":"yyyy-MM-dd"
}
}
}
}
But I still get the date_histogram not in the range.
Unexpected Response :
aggregations.date_ranges.buckets[0].key_as_string eq 2024-07-08
{
"hits": {
"total": {
"value": 13,
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "logging-events",
"_id": "BuP86ZAB2uzWuz8xZdkz",
"_score": 0.0,
"_source": {
"_class": "com.ecommerce.monitoring.services.event.LoggingEvent",
"event_type": "CLICK",
"session_id": "b2f4d3a1-4d4e-4f67-9b2a-0e7b0c9b5e2f",
"product_id": 22,
"shop_id": 2,
"timestamp": "2024-07-10T06:15:42.000Z"
}
}
]
},
"aggregations": {
"date_ranges": {
"buckets": [
{
"key_as_string": "2024-07-08",
"key": 1720396800000,
"doc_count": 6
},
{
"key_as_string": "2024-07-16",
"key": 1721088000000,
"doc_count": 7
}
]
}
}
}
Expected: Starts from 2024-07-10
First of all, I think the following sentence in Elasticsearch documentation doesn't capture what's actually going one and might have confused you.
With
extended_bounds
setting, you now can "force" the histogram aggregation to start building buckets on a specific min value and also keep on building buckets up to a max value (even if there are no documents anymore).
You cannot force to build buckets on a specific value, you can only force building empty buckets before and after your last available document, but the bucket will start on values specified by the fixed_interval
regardless of what you specify in extended_bounds
. So, if the fixed interval is 8d
it will only create buckets every 8 days since Jan 1, 1970.
2024-07-08 is 19,912th day after Jan 1, 1970 and 19,912 is divisible by 8 so this is why bucket starts on that date and not on 2024-07-10. To move bucket to 2024-07-10 you need to add 2d
offset to date histogram. So, in your case the query should look like this:
{
"size": 0,
"query": {
"bool": {
"filter": [
{
"range": {
"timestamp": {
"gt": "2024-07-10T00:00:00",
"lt": "2024-07-22T23:59:59"
}
}
}
]
}
},
"aggregations": {
"date_ranges": {
"date_histogram": {
"field": "timestamp",
"fixed_interval": "8d",
"format":"yyyy-MM-dd",
"offset": "2d"
}
}
}
}
You might also want to take a look at using hard_bounds
instead of query to limit buckets that are getting created.