
ElasticSearch-dsl Create Query


Hello Everyone:

I have been trying for a long time to replicate this query using the elasticsearch-dsl Search() class, but I have not been able to get it to work.

The query I want to replicate is:

{
  "_source": {
    "includes": ["SendingTime", "Symbol", "NoMDEntries", "*"]
  },
  "from": 0,
  "size": 10000,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "SendingTime": {
              "gte": "Oct 3, 2018 08:00:00 AM",
              "lt": "Oct 3, 2018 02:00:59 PM"
            }
          }
        }
      ]
    }
  }
}

The datetimes would eventually be replaced by variables.

So far the only thing I've been able to do is:

search = Search(using=elastic_search, index="bcs-md-bmk-prod")\
    .query("bool", range= {"SendingTime" : {'gte': format_date(datetime.now() - relativedelta(days=1)), 'lt': format_date(datetime.now())}})\

I know I'm really far from what I want to get, so I'd appreciate it if anyone could help me.


Solution

  • There are multiple ways to construct the same query in elasticsearch-dsl. That is meant as a convenience, but it sometimes (maybe often) confuses new users.

    Firstly, each raw query type maps one-to-one to an elasticsearch-dsl query class. For example, the following are equivalent:

    # 1
    'query': {
        'multi_match': {
            'query': 'whatever you are looking for',
            'fields': ['title', 'content', 'footnote']
        }
    }
    # 2
    from elasticsearch_dsl.query import MultiMatch
    MultiMatch(query='whatever you are looking for', fields=['title', 'content', 'footnote'])
    

    Secondly, these pairs are equivalent in elasticsearch-dsl:

    # 1 - using a class
    from elasticsearch_dsl.query import MultiMatch
    MultiMatch(query='whatever you are looking for', fields=['title', 'content', 'footnote'])
    # 2 - using Q shortcut
    Q('multi_match', query='whatever you are looking for', fields=['title', 'content', 'footnote'])
    

    and

    # 1 - using query type + keyword arguments
    Q('multi_match', query='whatever you are looking for', fields=['title', 'content', 'footnote'])
    # 2 - using dict representation
    Q({'multi_match': {'query': 'whatever you are looking for', 'fields': ['title', 'content', 'footnote']}})
    

    and

    # 1 - using Q shortcut (s is an existing Search object)
    q = Q('multi_match', query='whatever you are looking for', fields=['title', 'content', 'footnote'])
    s.query(q)
    # 2 - using parameters for Q directly
    s.query('multi_match', query='whatever you are looking for', fields=['title', 'content', 'footnote'])
    

    Now, if we recall the structure of a bool query, it consists of boolean clauses, each clause with a 'typed occurrence' (must, should, must_not, etc.). Since each clause is also a 'query' (in your case a range query), it follows the same pattern as a 'query', which means it can be represented by a Q shortcut.

    So, the way I would construct your query is:

    search = (
        Search(using=elastic_search, index="bcs-md-bmk-prod")
        .query(Q('bool', must=[Q('range', SendingTime={"gte": "Oct 3, 2018 08:00:00 AM", "lt": "Oct 3, 2018 02:00:59 PM"})]))
        .source(includes=["SendingTime","Symbol","NoMDEntries","*"])
    )
    

    Note that the first Q can be removed for simplicity, making that line:

    .query('bool', must=[Q('range', SendingTime={"gte": "Oct 3, 2018 08:00:00 AM", "lt": "Oct 3, 2018 02:00:59 PM"})])
    

    but I'd keep it so that it's easier to understand. Feel free to trade off between different representations.

    Last but not least, when you have difficulty constructing a query in elasticsearch-dsl, you can always fall back to the raw dict representation by using the from_dict() method of the elasticsearch_dsl.Search class.