python elasticsearch elasticsearch-py

Python elasticsearch range query


I know there are several alternative Elasticsearch clients for Python besides this one, but I do not have access to them. How can I write a query with 'less than or equal' logic on a timestamp? My current way of doing this is:

query = 'group_id:"' + gid + '" AND data_model.fields.price:' + price
less_than_time = ...  # datetime object
data = self.es.search(index=self.es_index, q=query, size=searchsize)
hits = data['hits']['hits']
results = []
for hit in hits:
    time = datetime.strptime(hit['_source']['data_model']['utc_time'], time_format)
    # keep only hits whose timestamp is at or before the cutoff
    if time <= less_than_time:
        results.append(hit)

This is a really clumsy way of doing it. Is there a way I can keep my query generation using strings and include a range?


Solution

  • I have a little script that generates a query for me. The query, however, is in JSON notation (the query DSL), which I believe the client can use.

    Here's my script:

    #!/usr/bin/env python3
    
    from datetime import datetime
    import sys
    
    # Range clause on @timestamp: "gte" is inclusive, "lt" is exclusive.
    RANGE = '"range":{"@timestamp":{"gte":"%s","lt":"%s"}}'
    # Bool query that combines a prefix match with the range clause.
    QUERY = '{"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{%s}]}}}'
    
    if __name__ == "__main__":
        if len(sys.argv) < 3:
            print("\nERROR: 2 Date arguments needed: From and To, for example:\n\n./range_query.py 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z\n\n")
            sys.exit(1)
        try:
            # Parsed only to validate the format; the raw strings go into the query.
            date1 = datetime.strptime(sys.argv[1], "%Y-%m-%dT%H:%M:%S.%fZ")
            date2 = datetime.strptime(sys.argv[2], "%Y-%m-%dT%H:%M:%S.%fZ")
    
        except ValueError:
            print("\nERROR: Invalid dates. From: %s, To: %s" % (sys.argv[1], sys.argv[2]) + "\n\nValid date format: %Y-%m-%dT%H:%M:%S.%fZ\n")
            sys.exit(1)
    
        range_q = RANGE % (sys.argv[1], sys.argv[2])
    
        # Print the finished JSON query.
        print(QUERY % (range_q))
    

    The script also uses a bool query. It should be fairly easy to remove that and use only the time constraints for the range.
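
    As a rough sketch of that range-only variant, the query can also be built as a Python dict instead of a formatted string. The helper name build_range_body is made up for illustration, and the sketch assumes the cutoff is a naive datetime already in UTC and that the field is mapped as a date. It uses "lte" because the question asks for 'less than or equal'.

```python
import json
from datetime import datetime


def build_range_body(field, cutoff):
    """Build a range-only query body matching documents where field <= cutoff.

    Hypothetical helper for illustration; `cutoff` is assumed to be a
    naive datetime that is already in UTC.
    """
    # "lte" is the inclusive upper bound; swap in "lt" for an exclusive one.
    stamp = cutoff.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {"query": {"range": {field: {"lte": stamp}}}}


if __name__ == "__main__":
    body = build_range_body("@timestamp", datetime(2016, 8, 10))
    # Serialize to the same JSON notation the script above prints.
    print(json.dumps(body))
```

    With elasticsearch-py, a dict like this would typically be handed to search() through its body parameter rather than q, though that depends on the client version in use.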

    I hope this is what you're looking for.

    Calling it prints a query such as:

    ./range_prefix_query.py.tmp 2016-08-10T00:00:00.000Z 2016-08-10T00:00:00.000Z
    {"query":{"bool":{"must":[{"prefix": {"myType":"test"}},{"range":{"@timestamp":{"gte":"2016-08-10T00:00:00.000Z","lt":"2016-08-10T00:00:00.000Z"}}}]}}}
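
    If you would rather keep building the query as a string for the q parameter, the query-string syntax has its own range notation: square brackets are inclusive, curly braces are exclusive, and * leaves one end open. Below is a minimal sketch, assuming data_model.utc_time (taken from the question's code) is mapped as a date field and that the index accepts this timestamp format. The helper names are hypothetical.

```python
from datetime import datetime


def lte_range_clause(field, cutoff):
    """Return a query-string clause matching field <= cutoff (inclusive)."""
    # [* TO x] means "anything up to and including x".
    # The timestamp is quoted because it contains colons.
    stamp = cutoff.strftime("%Y-%m-%dT%H:%M:%S")
    return '%s:[* TO "%s"]' % (field, stamp)


def build_query(gid, price, cutoff):
    """Combine the question's two term filters with the range clause."""
    return 'group_id:"%s" AND data_model.fields.price:%s AND %s' % (
        gid,
        price,
        lte_range_clause("data_model.utc_time", cutoff),
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own gid, price and cutoff.
    print(build_query("abc", "10", datetime(2016, 8, 10)))
```

    The resulting string can then be passed as q=... exactly like the original query, which removes the need to filter the hits in Python afterwards.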
    

    Artur