I have documents in elasticsearch in which each document looks something like as follows:
{
"id": "T12890ADSA12",
"status": “CREATED”,
"type": “ABC”,
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-04-30T13:41:25.862Z"
}
For this document structure, I wanted to get all the documents which are having status as CREATED or SCHEDULED and TYPE is ABC. And in these filtered documents, I want to aggregate the number of documents based on currentDate - createdAt in days bucket. Eg.
And similarly for last 7 days.
Is there an easy way to do this in a single query?
Please find the below mapping, sample docs, aggregation query and response:
PUT my_date_index
{
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"status": {
"type": "keyword"
},
"type": {
"type": "keyword"
},
"updatedAt": {
"type": "date"
},
"createdAt": {
"type": "date"
}
}
}
}
POST my_date_index/_doc/1
{
"id": "T12890ADSA12",
"status": "CREATED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-06T05:00:00.000Z"
}
POST my_date_index/_doc/2
{
"id": "T12890ADSA13",
"status": "SCHEDULED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-05T13:41:25.862Z"
}
POST my_date_index/_doc/3
{
"id": "T12890ADSA14",
"status": "SCHEDULED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-04T06:00:00.000Z"
}
POST my_date_index/_doc/4
{
"id": "T12890ADSA15",
"status": "SCHEDULED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-03T07:00:00.000Z"
}
POST my_date_index/_search
{
"size": 0, <----- Remove this to return documents too
"query": {
"bool": {
"must": [
{
"term": {
"type": "ABC"
}
},
{
"range": {
"createdAt": {
"gte": "now-7d",
"lte": "now"
}
}
}
],
"should": [
{
"term": {
"status": "SCHEDULED"
}
},
{
"term": {
"status": "CREATED"
}
}
],
"minimum_should_match": 1
}
},
"aggs": {
"my_date": {
"date_histogram": {
"field": "createdAt",
"calendar_interval": "day",
"order": {
"_key": "desc"
}
}
}
}
}
Note that I've first filtered the documents based on date and the conditions which you've provided.
This would return all the documents. Post which I've applied date histogram query to get the documents for each and every day on that date range.
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"my_date" : {
"buckets" : [
{
"key_as_string" : "2020-07-06T00:00:00.000Z",
"key" : 1593993600000,
"doc_count" : 1
},
{
"key_as_string" : "2020-07-05T00:00:00.000Z",
"key" : 1593907200000,
"doc_count" : 1
},
{
"key_as_string" : "2020-07-04T00:00:00.000Z",
"key" : 1593820800000,
"doc_count" : 1
},
{
"key_as_string" : "2020-07-03T00:00:00.000Z",
"key" : 1593734400000,
"doc_count" : 1
}
]
}
}
}
Hope this helps!