
Elasticsearch count if more than one document with same value exists


I want to count how many distinct values of a field occur in more than one document. How can I write a DSL query to do this?

Example:

Let's say I have these documents:

{ _id:1, foo:1}
{ _id:2, foo:1}
{ _id:3, foo:3}
{ _id:4, foo:2}
{ _id:5, foo:3}

I want the number of distinct foo values that appear in more than one document. Here the count should be 2, because the values 1 and 3 each appear twice.

UPDATE

After running this terms aggregation:

{
   "size": 0,
   "aggs": {
      "counts": {
          "terms": {
              "field": "foo"
          }
      }
   }
}

I got this result:

'aggregations':{
    'counts':{
        'buckets':[
             {'doc_count': 221,'key': '10284'},
             {'doc_count': 71,'key': '6486'},
             {'doc_count': 71,'key': '7395'}
         ],
        'doc_count_error_upper_bound': 0,
        'sum_other_doc_count': 0
    }
}

I want an additional field, total_count, with the value 3, since there are 3 keys with a doc_count greater than 1. How can I do that?


Solution

  • You can try a simple terms aggregation on the foo field like this:

    {
       "size": 0,
       "aggs": {
          "counts": {
              "terms": {
                  "field": "foo"
              }
          }
       }
    }
    

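    After running this against the example documents, you'll get:

    • for key 1: doc_count 2
    • for key 3: doc_count 2
    • for key 2: doc_count 1

    Two keys (1 and 3) have a doc_count greater than 1, which is the count of 2 you are looking for.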
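
One way to get total_count is to count the qualifying buckets on the client side. The following is a minimal Python sketch of that idea, not a definitive solution. The helper names (build_query, count_repeated_values) are hypothetical, and the response dict is copied from the question rather than fetched from a live cluster. The sketch assumes the terms aggregation's min_doc_count option is set to 2, so that only values shared by at least two documents come back as buckets.

```python
from collections import Counter


def build_query(field="foo", min_doc_count=2):
    """Build the terms aggregation from the post, with min_doc_count added.

    min_doc_count is an option of the terms aggregation; setting it to 2
    (an assumption of this sketch) drops values that occur only once.
    """
    return {
        "size": 0,
        "aggs": {
            "counts": {
                "terms": {"field": field, "min_doc_count": min_doc_count}
            }
        },
    }


def count_repeated_values(response, agg_name="counts", threshold=2):
    """Count buckets whose doc_count is at least `threshold`.

    Filtering again here keeps the helper correct even if the query was
    sent without min_doc_count.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    return sum(1 for bucket in buckets if bucket["doc_count"] >= threshold)


# Response body copied from the question's UPDATE section.
question_response = {
    "aggregations": {
        "counts": {
            "buckets": [
                {"doc_count": 221, "key": "10284"},
                {"doc_count": 71, "key": "6486"},
                {"doc_count": 71, "key": "7395"},
            ],
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
        }
    }
}

# Simulated response for the five example documents (foo = 1, 1, 3, 2, 3).
example_counts = Counter([1, 1, 3, 2, 3])
example_response = {
    "aggregations": {
        "counts": {
            "buckets": [
                {"key": key, "doc_count": count}
                for key, count in example_counts.most_common()
            ]
        }
    }
}

total_count = count_repeated_values(question_response)
example_total = count_repeated_values(example_response)
print(total_count, example_total)
```

Keep in mind that a terms aggregation returns only its top buckets (10 by default, controlled by its size option), so this count covers only the buckets actually returned. Raise size if the field has many repeated values.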