elasticsearch, elasticsearch-aggregation

How can I filter on the doc_count value that results from a nested aggregation?


I want to filter buckets by the doc_count value that a nested aggregation produces. Here is my query:

    "aggs": {
      "CDIDs": {
        "terms": {
          "field": "CDID.keyword",
          "size": 1000
        },
        "aggs": {
          "my_filter": {
            "filter": {
              "range": {
                "transactionDate": {
                  "gte": "now-1M/M"
                }
              }
            }
          },
          "in_active": {
            "bucket_selector": {
              "buckets_path": {
                "doc_count": "_count"
              },
              "script": "params.doc_count > 4"
            }
          }
        }
      }
    }

The result of the query looks like:

    {
      "aggregations" : {
        "CDIDs" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 2386,
          "buckets" : [
            {
              "key" : "1234567",
              "doc_count" : 5,
              "my_filter" : {
                "doc_count" : 4
              }
            },
            {
              "key" : "12345",
              "doc_count" : 5,
              "my_filter" : {
                "doc_count" : 5
              }
            }
          ]
        }
      }
    }

I'm trying to filter on the second doc_count value here, the one inside my_filter. Say I want only the buckets whose filtered doc_count is > 4, so the result should contain a single bucket: the one where my_filter has doc_count = 5. How can I write this filter? Please let me know if any additional information is required.


Solution

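  • Take a close look at the bucket_selector aggregation. In your query, "_count" on its own refers to the document count of the enclosing CDIDs bucket, so the script compares the wrong value. To use the count from the nested filter aggregation, specify the aggregation name in the buckets_path section, i.e. "doc_count": "my_filter>_count"

    Pipeline aggregations have their own buckets_path syntax, in which > acts as the separator between an aggregation and its sub-aggregation (or a special path such as _count). See the buckets_path syntax section of the Elasticsearch pipeline aggregations documentation for more information.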

    Aggregation Query

    POST <your_index_name>/_search
    {  
       "size":0,
       "aggs":{  
          "CDIDs":{  
             "terms":{  
                "field":"CDID.keyword",
                "size":1000
             },
             "aggs":{  
                "my_filter":{  
                   "filter":{  
                      "range":{  
                         "transactionDate":{  
                            "gte":"now-1M/M"
                         }
                      }
                   }
                },
                "in_active":{  
                   "bucket_selector":{  
                      "buckets_path":{  
                         "doc_count":"my_filter>_count"
                      },
                      "script":"params.doc_count > 4"
                   }
                }
             }
          }
       }
    }
    

    Hope it helps!
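
To see why the path matters, here is a rough Python sketch that mimics the selection on the two sample buckets from the question. It is not how Elasticsearch implements this; `resolve_path` and `bucket_selector` are hypothetical helpers written for illustration, and they only handle the `>` separator and the special `_count` path.

```python
# Sample buckets copied from the response shown in the question.
buckets = [
    {"key": "1234567", "doc_count": 5, "my_filter": {"doc_count": 4}},
    {"key": "12345", "doc_count": 5, "my_filter": {"doc_count": 5}},
]


def resolve_path(bucket, path):
    """Resolve a simplified buckets_path (hypothetical helper).

    Supports only ">" as the aggregation separator and the special
    "_count" path, which maps to the doc_count of the current level.
    """
    node = bucket
    for part in path.split(">"):
        if part == "_count":
            return node["doc_count"]
        node = node[part]  # descend into the named sub-aggregation
    raise ValueError(f"path {path!r} does not end in _count")


def bucket_selector(all_buckets, path, threshold):
    """Keep buckets whose resolved value is > threshold (hypothetical helper)."""
    return [b for b in all_buckets if resolve_path(b, path) > threshold]


# "_count" reads the outer terms bucket count (5 for both buckets).
original = bucket_selector(buckets, "_count", 4)
# "my_filter>_count" reads the count of the nested filter aggregation.
fixed = bucket_selector(buckets, "my_filter>_count", 4)

print([b["key"] for b in original])
print([b["key"] for b in fixed])
```

With the original path both buckets pass, because each has an outer doc_count of 5. With "my_filter>_count" only the bucket with key 12345 is kept, which matches the result the question asks for.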