I have some documents indexed on Elasticsearch, looking like these samples:
{'country': 'France', 'collected': '2015-03-12', 'value': 20}
{'country': 'Canada', 'collected': '2015-03-12', 'value': 21}
{'country': 'Brazil', 'collected': '2015-03-12', 'value': 33}
{'country': 'France', 'collected': '2015-02-01', 'value': 10}
{'country': 'Canada', 'collected': '2015-02-01', 'value': 11}
{'country': 'Mexico', 'collected': '2015-02-01', 'value': 9}
...
I want to build a query that gets one result per country, getting only the ones with max(collected)
.
So, for the examples shown above, the results would be something like:
{'country': 'France', 'collected': '2015-03-12', 'value': 20}
{'country': 'Canada', 'collected': '2015-03-12', 'value': 21}
{'country': 'Brazil', 'collected': '2015-03-12', 'value': 33}
{'country': 'Mexico', 'collected': '2015-02-01', 'value': 9}
I realized I need to do aggregation on country
, but I'm failing to understand how to limit the results on max(collected)
.
Any ideas?
You can use a top_hits
aggregation that groups on the country
field, returns 1 doc per group, and orders the docs by the collected date descending:
POST /test/_search?search_type=count
{
"aggs": {
"group": {
"terms": {
"field": "country"
},
"aggs": {
"group_docs": {
"top_hits": {
"size": 1,
"sort": [
{
"collected": {
"order": "desc"
}
}
]
}
}
}
}
}
}