
Elasticsearch - Aggregation on multiple fields in the same nested scope


I'm aggregating product search results by topics, each of which has a name and an ID field. How do I get both fields back in an aggregation bucket? I can get one or the other, but I can't figure out how to get both. Scripting is disabled on my cluster, so a script-based solution isn't an option.

Here's my product mapping (simplified for this question):

"mappings": {
    "products": {
        "properties": { 
            "title": {
                "type": "string"
            },
            "description": {
                "type": "string"
            },
            "topics": {
                "properties": {
                    "id": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "name": {
                        "type" : "string",
                        "index": "not_analyzed"
                    }
                }
            }
        }
    }
}

Here's my query:

"query": {
    "multi_match": {
        "query": "Testing 1 2 3",
        "fields": ["title", "description"]
    }
},
"aggs": {
    "Topics": {
        "terms": {
            "field": "topics.id",
            "size": 15
        }
    }
}

My aggregation buckets look like this:

[Image: screenshot of the aggregation buckets]

...the "key" value in the first bucket is the topics.id field value. Is there a way to add my topics.name field to the bucket?


Solution

  • To get both fields back, each bucket effectively has to be keyed by an (id, name) pair, which requires Elasticsearch to keep each id associated with its name. Without a nested mapping, an array of topic objects is flattened into separate arrays of ids and names, so that association is lost. Hence, you need to map topics as nested.

    "topics": {
        "type": "nested",
        "properties": {
            "id": {
                "type": "string",
                "index": "not_analyzed"
            },
            "name": {
                "type": "string",
                "index": "not_analyzed"
            }
        }
    }
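
    To see why the nested type matters, here is a small sketch in plain Python (no Elasticsearch client involved) that mimics how a non-nested array of objects ends up as one multi-valued field per key. The function name flatten_object_array and the sample topics are made up for illustration; this is only a model of the indexing behaviour, not Elasticsearch code.

```python
def flatten_object_array(field, objects):
    """Mimic a non-nested object array: one multi-valued field per key."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


# Hypothetical topics on a single product document.
topics = [
    {"id": "123", "name": "topic1"},
    {"id": "456", "name": "topic2"},
]

flat = flatten_object_array("topics", topics)
print(flat)
# {'topics.id': ['123', '456'], 'topics.name': ['topic1', 'topic2']}
```

    Once flattened like this, nothing records that "123" belongs with "topic1", which is the association the nested mapping preserves.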
    

    For aggregation on multiple fields, you need to use sub-aggregations.

    Here is a sample aggregation query:

    {
        "aggs": {
            "topics_agg": {
                "nested": {
                    "path": "topics"
                },
                "aggs": {
                    "name": {
                        "terms": {
                            "field": "topics.id"
                        },
                        "aggs": {
                            "name": {
                                "terms": {
                                    "field": "topics.name"
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    

    Sample aggregation result:

        "aggregations": {
              "topics_agg": {
                 "doc_count": 5,
                 "name": {
                    "doc_count_error_upper_bound": 0,
                    "sum_other_doc_count": 0,
                    "buckets": [
                       {
                          "key": 123,
                          "doc_count": 6,
                          "name": {
                             "doc_count_error_upper_bound": 0,
                             "sum_other_doc_count": 0,
                             "buckets": [
                                {
                                   "key": "topic1",
                                   "doc_count": 3
                                },
                                {
                                   "key": "topic2",
                                   "doc_count": 3
                                }
                             ]
                          }
                       },
                       {
                          "key": 456,
                          "doc_count": 2,
                          "name": {
                             "doc_count_error_upper_bound": 0,
                             "sum_other_doc_count": 0,
                             "buckets": [
                                {
                                   "key": "topic1",
                                   "doc_count": 2
                                }
                             ]
                          }
                       },
    ..............
    

    Note: For id 123 there are multiple name buckets, because that id has more than one name value. To get separate unique buckets, combine each parent bucket with each of its child buckets.

    For example: 123-topic1, 123-topic2, 456-topic1
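
The parent-child combinations can be built client-side from the response. Below is a minimal sketch in Python, assuming the response has already been parsed into a dict and that the aggregation names match the sample query ("topics_agg" for the nested aggregation, "name" for both terms aggregations). The helper id_name_pairs and the trimmed sample dict are illustrative only, not part of any Elasticsearch client.

```python
def id_name_pairs(aggregations, nested_agg="topics_agg", id_agg="name", name_agg="name"):
    """Return ("<id>-<name>", doc_count) for every id bucket / name bucket combination."""
    pairs = []
    for id_bucket in aggregations[nested_agg][id_agg]["buckets"]:
        for name_bucket in id_bucket[name_agg]["buckets"]:
            label = f"{id_bucket['key']}-{name_bucket['key']}"
            pairs.append((label, name_bucket["doc_count"]))
    return pairs


# Trimmed copy of the sample result; terms buckets expose their value under "key".
sample = {
    "topics_agg": {
        "doc_count": 5,
        "name": {
            "buckets": [
                {
                    "key": 123,
                    "doc_count": 6,
                    "name": {
                        "buckets": [
                            {"key": "topic1", "doc_count": 3},
                            {"key": "topic2", "doc_count": 3},
                        ]
                    },
                },
                {
                    "key": 456,
                    "doc_count": 2,
                    "name": {"buckets": [{"key": "topic1", "doc_count": 2}]},
                },
            ]
        },
    }
}

pairs = id_name_pairs(sample)
for label, count in pairs:
    print(label, count)
```

Each tuple pairs the combined label with the doc_count of the inner name bucket, so an id with several names yields one entry per name.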