elasticsearch, elasticsearch-aggregation, elasticsearch-dsl

Filter aggregation keys with a non-nested mapping in Elasticsearch


I have the following mapping:

{
  "Country": {
    "properties": {
      "State": {
        "properties": {
          "Name": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          },
          "Code": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          },
          "Lang": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

This is a sample document:

{
  "Country": {
    "State": [
      {
        "Name": "California",
        "Code": "CA",
        "Lang": "EN"
      },
      {
        "Name": "Alaska",
        "Code": "AK",
        "Lang": "EN"
      },
      {
        "Name": "Texas",
        "Code": "TX",
        "Lang": "EN"
      }
    ]
  }
}

I query this index to get a count of states by name, using the following query:

{
  "from": 0,
  "size": 0,
  "query": {
    "query_string": {
      "query": "Country.State.Name: *Ala*"
    }
  },
  "aggs": {
    "counts": {
      "terms": {
        "field": "Country.State.Name.raw",
        "include": ".*Ala.*"
      }
    }
  }
}

With the include regex in the terms aggregation, I can restrict the buckets to keys that match the query_string, but there seems to be no way to make the include regex case-insensitive.

The result I want is:

{
  "aggregations": {
    "counts": {
      "buckets": [
        {
          "key": "Alaska",
          "doc_count": 1
        }
      ]
    }
  }
}

Is there another way to get only the keys that match the query_string without using a nested mapping?
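
One option that stays within include is to spell out both cases of each letter with character classes, which the Lucene regular expression syntax used by include supports. The following is a minimal sketch, assuming a short literal ASCII search term ("ala" here); the pattern would have to be generated by the client for each term, and longer or non-ASCII terms make it unwieldy.

```json
{
  "from": 0,
  "size": 0,
  "query": {
    "query_string": {
      "query": "Country.State.Name: *Ala*"
    }
  },
  "aggs": {
    "counts": {
      "terms": {
        "field": "Country.State.Name.raw",
        "include": ".*[Aa][Ll][Aa].*"
      }
    }
  }
}
```

Because the filter still runs against the keyword sub-field, this keeps the terms aggregation on doc values and avoids a script, at the cost of a less readable pattern.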


Solution

  • I was able to fix the problem by using an inline script to filter the keys. (It is still a dirty fix, but it solves my use case for now and lets me avoid mapping changes.)

    Here is how I am executing the query:

    {
      "from": 0,
      "size": 0,
      "query": {
        "query_string": {
          "query": "Country.State.Name: *Ala*"
        }
      },
      "aggs": {
        "counts": {
          "terms": {
            "script": {
              "source": "doc['Country.State.Name.raw'].value.toLowerCase().contains('ala') ? doc['Country.State.Name.raw'].value : null",
              "lang": "painless"
            }
          }
        }
      }
    }
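
One caveat with the script above: .value reads only the first value of a multi-valued field, so a document with several matching states (or one whose matching state is not the first value) may be undercounted. The variant below is a sketch, not a tested replacement. It loops over every value of the field and returns the matches as a list, on the assumption that your Elasticsearch version accepts a list as the return value of a terms aggregation script. The variable names out and name and the needle parameter are illustrative choices, not part of the original answer.

```json
{
  "from": 0,
  "size": 0,
  "query": {
    "query_string": {
      "query": "Country.State.Name: *Ala*"
    }
  },
  "aggs": {
    "counts": {
      "terms": {
        "script": {
          "lang": "painless",
          "source": "def out = []; for (def name : doc['Country.State.Name.raw']) { if (name.toLowerCase().contains(params.needle)) { out.add(name); } } return out;",
          "params": {
            "needle": "ala"
          }
        }
      }
    }
  }
}
```

Passing the search term through params rather than embedding it in the source keeps the script text identical across requests, so the compiled script can be reused. The needle should be supplied in lowercase, since it is compared against the lowercased field value.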