
Elasticsearch Aggregations: Only return results of one of them?


I'm trying to find a way to return the results of only one aggregation in an Elasticsearch query. I have a max_bucket aggregation (the one I want to see) that is computed from a sum aggregation nested inside a date_histogram aggregation. Right now, I have to go through 1,440 per-minute buckets to get to the one result I want to see. I've already removed the hits of the base query with the "size": 0 setting, but is there a way to do something similar for the aggregations as well? I've tried putting the same setting in a few places with no luck.

Here's the query:

{
    "size": 0,
    "query": {
        "range": {
            "timestamp": {
                "gte": "2018-11-28",
                "lte": "2018-11-28"
            }
        }
    },
    "aggs": {
        "hits_per_minute": {
            "date_histogram": {
                "field": "timestamp",
                "interval": "minute"
            },
            "aggs": {
                "total_hits": {
                    "sum": {
                        "field": "hits_count"
                    }
                }
            }
        },
        "max_transactions_per_minute": {
            "max_bucket": {
                "buckets_path": "hits_per_minute>total_hits"
            }
        }
    }
}

Solution

  • Fortunately, you can do that with the bucket_sort pipeline aggregation, which was added in Elasticsearch 6.1.

    Do it with bucket_sort

    POST my_index/doc/_search
    {
      "size": 0,
      "query": {
        "range": {
          "timestamp": {
            "gte": "2018-11-28",
            "lte": "2018-11-28"
          }
        }
      },
      "aggs": {
        "hits_per_minute": {
          "date_histogram": {
            "field": "timestamp",
            "interval": "minute"
          },
          "aggs": {
            "total_hits": {
              "sum": {
                "field": "hits_count"
              }
            },
            "max_transactions_per_minute": {
              "bucket_sort": {
                "sort": [
                  {"total_hits": {"order": "desc"}}
                ],
                "size": 1
              }
            }
          }
        }
      }
    }
    

    This will give you a response like this:

    {
      ...
      "aggregations": {
        "hits_per_minute": {
          "buckets": [
            {
              "key_as_string": "2018-11-28T21:10:00.000Z",
              "key": 1543439400000,
              "doc_count": 3,
              "total_hits": {
                "value": 11
              }
            }
          ]
        }
      }
    }
    

    Note that the response contains no extra aggregation, and the bucket list of hits_per_minute is truncated, because bucket_sort was asked to keep exactly one bucket: the one with the highest total_hits.
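
To make the effect of the sort and size settings concrete, here is a minimal Python sketch that mimics them on a hand-made list of buckets. The helper name bucket_sort_top and the sample buckets are made up for illustration; this only imitates the observable result and is not how Elasticsearch implements bucket_sort.

```python
def bucket_sort_top(buckets, metric, size=1):
    """Order buckets by a metric's value, highest first, and keep `size` of them."""
    ranked = sorted(buckets, key=lambda b: b[metric]["value"], reverse=True)
    return ranked[:size]

# Hypothetical per-minute buckets, shaped like date_histogram output.
sample_buckets = [
    {"key_as_string": "2018-11-28T21:09:00.000Z", "doc_count": 2, "total_hits": {"value": 7}},
    {"key_as_string": "2018-11-28T21:10:00.000Z", "doc_count": 3, "total_hits": {"value": 11}},
    {"key_as_string": "2018-11-28T21:11:00.000Z", "doc_count": 1, "total_hits": {"value": 4}},
]

# Equivalent in spirit to "sort": [{"total_hits": {"order": "desc"}}], "size": 1
top = bucket_sort_top(sample_buckets, "total_hits", size=1)
print(top[0]["key_as_string"], top[0]["total_hits"]["value"])
```

Only the 21:10 bucket survives, which matches the shape of the truncated response above.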

    Do it with filter_path

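    There is also a generic way to trim the output of Elasticsearch: response filtering with the filter_path query parameter. It works on any response, so the query itself does not have to change.

    In this case it is enough to send the original query with filter_path pointing at the aggregation you want to keep: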

    POST my_index/doc/_search?filter_path=aggregations.max_transactions_per_minute
    { ... (original query) ... }
    

    That would give the response:

    {
      "aggregations": {
        "max_transactions_per_minute": {
          "value": 11,
          "keys": [
            "2018-11-28T21:10:00.000Z"
          ]
        }
      }
    }
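
As a rough illustration of what filter_path does to the response, the Python sketch below prunes a dictionary down to a single dotted path. It is a simplification under stated assumptions: the function filter_path here is a hypothetical local helper, it handles neither wildcards nor arrays nor multiple comma-separated paths, and the sample response is invented. Elasticsearch applies the real filtering on the server side.

```python
def filter_path(response, path):
    """Keep only the branch of `response` addressed by a dotted path (no wildcards)."""
    keys = path.split(".")
    node = response
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return {}  # nothing matched the path
        node = node[key]
    # Re-wrap the matched value in its parent keys, innermost first.
    for key in reversed(keys):
        node = {key: node}
    return node

# Hypothetical unfiltered response, heavily shortened.
full_response = {
    "took": 5,
    "timed_out": False,
    "aggregations": {
        "hits_per_minute": {
            "buckets": [
                {"key_as_string": "2018-11-28T21:09:00.000Z", "total_hits": {"value": 7}},
                {"key_as_string": "2018-11-28T21:10:00.000Z", "total_hits": {"value": 11}},
            ]
        },
        "max_transactions_per_minute": {
            "value": 11,
            "keys": ["2018-11-28T21:10:00.000Z"],
        },
    },
}

slim = filter_path(full_response, "aggregations.max_transactions_per_minute")
print(slim)
```

The per-minute buckets are still computed in this approach; they are only left out of the response body, whereas bucket_sort changes which buckets the aggregation returns.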