elasticsearch, elasticsearch-aggregation, elasticsearch-dsl

Elasticsearch DSL 'top_hits' Aggregation Not allowing 'From' Parameter


The Elasticsearch 'top_hits' aggregation provides these three options: from, size, and sort.

When I try to write the query with the 'from' parameter in elasticsearch_dsl, it gives an invalid syntax error. Does elasticsearch_dsl not allow the 'from' parameter or am I missing something? Here is my elasticsearch_dsl query:

    s = Search(using=client, index=index).params(size=0)

    s.aggs.bucket('duplicateCount', 'terms', field=field, min_doc_count=2) \
        .bucket('duplicateDocuments', 'top_hits', sort=[{f"{sort_field}":"desc"}], from=2)

The error message was posted as a screenshot.


Solution

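  • Since top_hits is a metric aggregation (i.e. not a bucket one), you need to use the metric() function rather than bucket(). Also, since from is a reserved keyword in Python, it cannot be written as a keyword argument (from=2). Quoting alone is not enough: pass it as a string key in a dictionary and unpack that dictionary into the call, i.e. **{'from': 2}.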

    s.aggs.bucket('duplicateCount', 'terms', field=field, min_doc_count=2) \
        .metric('duplicateDocuments', 'top_hits', sort=[{f"{sort_field}":"desc"}], **{'from': 2})
           ^
           |
      change this
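
The keyword clash itself can be reproduced without Elasticsearch. Below is a minimal sketch, assuming a hypothetical top_hits() helper that merely stands in for the library call (it is not part of elasticsearch_dsl) and a made-up sort field named created_at. It shows that from=2 fails at parse time, while dictionary unpacking passes the same key through as an ordinary string:

```python
import keyword


def top_hits(**params):
    # Hypothetical stand-in for an aggregation builder: it simply forwards
    # its keyword arguments into the request body unchanged.
    return {"top_hits": params}


# 'from' is a reserved word, so it can never be written as name=value in a call.
from_is_reserved = keyword.iskeyword("from")

# Compiling the offending call shows that the failure happens while parsing,
# before any library code gets a chance to run.
try:
    compile("top_hits(from=2)", "<example>", "eval")
    parse_failed = False
except SyntaxError:
    parse_failed = True

# Unpacking a dict sidesteps the parser: the key is just a string.
body = top_hits(sort=[{"created_at": "desc"}], **{"from": 2})
print(from_is_reserved, parse_failed, body)
```

The same unpacking trick applies to any other parameter whose name collides with a Python keyword.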