Search code examples
pythonelasticsearch-dsl

Translating Elasticsearch request from Kibana into elasticsearch-dsl


Recently migrated from AWS Elasticsearch Service (used Elasticsearch 1.5.2) to Elastic Cloud (currently using Elasticsearch 5.1.2). Glad I did it, but with that change comes a newer version of Elasticsearch and newer API's. Struggling to get my head around the new way of requesting stuff. Formerly, I could more or less copy/paste from Kibana's "Elasticsearch Request Body", adjust a few things, run elasticsearch.Elasticsearch.search() and get what I expect.

Here's my Elasticsearch Request Body from Kibana (for brevity, removed some of the extraneous stuff that Kibana usually inserts):

{
  "size": 500,
  "sort": [
    {
      "Time.ISO8601": {
        "order": "desc",
        "unmapped_type": "boolean"
      }
    }
  ],
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "Message\\ ID: 2003",
            "analyze_wildcard": true
          }
        },
        {
          "range": {
            "Time.ISO8601": {
              "gte": 1484355455678,
              "lte": 1484359055678,
              "format": "epoch_millis"
            }
          }
        }
      ],
      "must_not": []
    }
  },
  "stored_fields": [
    "*"
  ],
  "script_fields": {},
}

Now I want to use elasticsearch-dsl to do it, since that seems to be the recommended method (instead of using elasticsearch-py). How would I translate the above into elasticsearch-dsl?

Here's what I have so far:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search, Q

client = Elasticsearch(
        hosts=['HASH.REGION.aws.found.io/elasticsearch'],
        use_ssl=True,
        port=443,
        http_auth=('USER','PASS')
    )

s = Search(using=client, index="emp*")
s = s.query("query_string", query="Message\ ID:2003", analyze_wildcards=True)
s = s.query("range", **{"Time.ISO8601": {"gte": 1484355455678, "lte": 1484359055678, "format": "epoch_millis"}})
s = s.sort("Time.ISO8601")

response = s.execute()

for hit in response:
    print '%s %s' % (hit['Time']['ISO8601'], hit['Message ID']) 

My code written as above is not giving me what I expect. Getting results that include stuff that doesn't match "Message\ ID:2003", and also it's giving me things outside the requested range of Time.ISO8601 as well.

Totally new to elasticsearch-dsl and ES 5.1.2's way of doing things, so I know I've got lots to learn. What am I doing wrong? Thanks in advance for the help!


Solution

  • I don't have elasticsearch running right now but the query looks like what you wanted (you can always see the query produced by looking at s.to_dict()) with the exception of escaping the \ sign. In the original query it was escaped yet in python the result might be different due to different escaping.

    I wuld strongly advise to not have spaces in your fields and also to use a more structured query than query_string:

    s = Search(using=client, index="emp*")
    s = s.filter("term", message_id=2003)
    s = s.query("range", Time__ISO8601={"gte": 1484355455678, "lte": 1484359055678, "format": "epoch_millis"})
    s = s.sort("Time.ISO8601")
    

    Note that I also changed query() to filter() for a slight speedup and used __ instead of . in the field name keyword argument. elasticsearch-dsl will automatically expand that to ..

    Hope this helps...