Search code examples
jsonelasticsearchgroupingnosql-aggregationnosql

Elastic Search aggregation enhanced filtering for nested query


I have the following objects indexed:

{ "ProjectName" : "Project 1",
  "Roles" : [
     { "RoleName" : "Role 1", "AddedAt" : "2015-08-14T17:11:31" },
     { "RoleName" : "Role 2", "AddedAt" : "2015-09-14T17:11:31" } ] }

{ "ProjectName" : "Project 2",
  "Roles" : [
     { "RoleName" : "Role 1", "AddedAt" : "2015-10-14T17:11:31" } ] }

{ "ProjectName" : "Project 3",
  "Roles" : [
     { "RoleName" : "Role 2", "AddedAt" : "2015-11-14T17:11:31" } ] }

I.e., a list of projects with different roles added, added in different time. (Roles list is a nested field)

What I need is to have aggregation which would select how many projects exist per certain role, BUT only(!) if the role was added to the project in certain period.

A classic query (without the dates rande filtering) looks like this (and works well):

{ // ... my main query here

"aggs" : {
    "agg1" : {
        "nested" : {
            "path" : "Roles"
        },
        "aggs" : {
            "agg2": {                    
                "terms": {
                    "field" : "Roles.RoleName"
                },
                "aggs": {
                    "agg3":{
                        "reverse_nested": {}
                    }}}}}}

But this approach is not working for me, because if I need filtering by dates starting from let's say '2015-09-01', both 'Role 1' and 'Role 2' would be selected for the first project (i.e., the project for them) as the 'Role 1' would hit because 'Role 2''s project hits because of the 'Role 2' AddedAt date criterium.

So, I consider, I should add the following condition somewhere additionally:

"range": { "Roles.AddedAt": {
                                "gte": "2015-09-01T00:00:00",
                                "lte": "2015-12-02T23:59:59"
                             }}

But I can not find a correct way to do that.


The results of the working query are (kind of) the following:

"aggregations": {
  "agg1": {
     "doc_count": 17,
     "agg2": {
        "buckets": [
           {
              "key": "Role 1",
              "doc_count": 2,
              "agg3": {
                 "doc_count": 2
              }
           },
           {
              "key": "Role 2",
              "doc_count": 2,
              "agg3": {
                 "doc_count": 2
              }
           },

Solution

  • Try this:

    {
      "aggs": {
        "agg1": {
          "nested": {
            "path": "Roles"
          },
          "aggs": {
            "NAME": {
              "filter": {
                "query": {
                  "range": {
                    "Roles.AddedAt": {
                      "gte": "2015-09-01T00:00:00",
                      "lte": "2015-12-02T23:59:59"
                    }
                  }
                }
              },
              "aggs": {
                "agg2": {
                  "terms": {
                    "field": "Roles.RoleName"
                  },
                  "aggs": {
                    "agg3": {
                      "reverse_nested": {}
                    }
                  }
                }
              }
            }
          }
        }
      }
    }