
Elasticsearch term query on date field


Given that the some_date field is mapped as a date, if I want to match documents against an exact date, the recommended approach is to use a range query with gte and lte.

{
    "range" : {
        "some_date" : {
            "gte": "2020-01-01",
            "lte": "2020-01-01",
            "format": "yyyy-MM-dd"
        }
    }
}

However, it seems to work with a plain term query as well:

{
    "term" : {
        "some_date" : {
            "value" : "2020-01-01"
        }
    }
}

The query above works and returns the expected result. However, there is no definitive documentation about this: the Elasticsearch documentation for the term query does not mention dates at all.

It advises against using term queries on text fields, but does not say whether a term query can be used for things like dates, numbers, or booleans. In practice, everything seems to work.

My question is: can we rely on date fields behaving as expected within term queries, or should we always use the range query?

Note that range is itself a term-level query. The phrase "term-level query" covers a set of queries: term, terms, range, etc. are all term-level queries.


Solution

  • A term query on a date field is actually implemented as a range query (see this issue). With a value given at day precision, it covers the whole day, i.e.

    {
      "term": {
         "date": "2015-12-12"
      }
    }
    

    is equivalent to

    {
      "range": {
         "date": {
            "gte": "2015-12-12T00:00:00Z",
            "lte": "2015-12-12T23:59:59.999Z"
         }
      }
    }
    

    It's also visible if you profile your query:

    POST test/_search
    {
      "profile": true, 
      "query":{
        "term" : {
           "date":  "2024-01-01"
        }
      }
    }
    

    In the response you get the following:

    IndexOrDocValuesQuery(indexQuery=date:[1704067200000 TO 1704153599999], dvQuery=date:[1704067200000 TO 1704153599999])
    

    where:

    • 1704067200000 is 2024-01-01T00:00:00Z
    • 1704153599999 is 2024-01-01T23:59:59.999Z

    So if you use the term query, you incur one additional (date) parsing and (term to range) translation step; it is thus more "optimal" to use the range query directly.
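
The two bounds in the profile output above can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic, assuming the date is interpreted in UTC; `day_bounds_millis` is a hypothetical helper written for this illustration, not part of any Elasticsearch client.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def day_bounds_millis(day: str) -> Tuple[int, int]:
    """Return inclusive (start, end) epoch-millisecond bounds of a UTC day.

    Hypothetical helper: it mirrors the whole-day range seen in the
    profile output, assuming the date is interpreted in UTC.
    """
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    next_day = start + timedelta(days=1)
    start_ms = int(start.timestamp()) * 1000
    # The last millisecond of the day is one less than the next midnight.
    end_ms = int(next_day.timestamp()) * 1000 - 1
    return start_ms, end_ms


print(day_bounds_millis("2024-01-01"))  # (1704067200000, 1704153599999)
```

The printed pair matches the `[1704067200000 TO 1704153599999]` range reported by the profiler, i.e. the first and last millisecond of 2024-01-01 in UTC.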