How to add fuzziness to a text field that contains a date


I have a field that contains a date (format 2023-12-25), but it is mapped as a text type. I need to search by date while tolerating a possible typo. I tried it like this:

POST _msearch/
{"index" : "some_index"}
{"query": {"bool" : {"must" : [{"match": {"birthDate": {"query": "1939-02-21", "fuzziness": 1}}}]}}}

It doesn't work. If I change the date to 1939-05-17, the result remains the same, even if I try 17-05-1939.

It's strange, but with other text-type fields fuzziness works correctly.

I expect Elasticsearch to find the right date (for example, 1939-02-21) if the input is 1939-02-22 or 1939-03-21, but not if it is 1939-03-22.


Solution

  • Elasticsearch measures the similarity between two text strings with the Levenshtein distance during a fuzzy query.

    In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences.

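    Levenshtein distance is defined over any characters, digits and punctuation included, so as whole strings "1939-02-21" and "1939-02-22" are one edit apart, just like "shark" and "shard". The likely problem is analysis: a text field with the default standard analyzer splits the date at the hyphens into the separate terms 1939, 02 and 21. The match query then applies fuzziness to each term on its own and combines the terms with OR by default, so any document that shares a term such as 1939 matches. That would explain why changing the day and month, or even reordering the date, does not change the result.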

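    Alternatively, you could use a range query with dynamic bounds on a field mapped as date (timestamp here is a placeholder; put your own bounds in gte and lte):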

    GET /_search
    {
      "query": {
        "range": {
          "timestamp": {
            "gte": "now-30d/d",
            "lte": "now/d"
          }
        }
      }
    }
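
To check which inputs are one edit away from a stored date, the Levenshtein distance can be computed directly on the date strings. The following is a minimal Python sketch, not a definitive solution: `levenshtein` and `fuzzy_date_query` are hypothetical helper names, and `birthDate.keyword` is an assumed keyword-mapped copy of the field (one that keeps the whole date as a single term), which the original mapping may not have.

```python
import json


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_date_query(field: str, value: str, max_edits: int = 1) -> dict:
    """Build a fuzzy query body; `field` is assumed to hold the date as one term."""
    return {"query": {"fuzzy": {field: {"value": value, "fuzziness": max_edits}}}}


stored = "1939-02-21"
for typed in ("1939-02-22", "1939-03-21", "1939-03-22"):
    print(typed, levenshtein(stored, typed))

# Assumption: "birthDate.keyword" is a keyword subfield, not part of the question.
print(json.dumps(fuzzy_date_query("birthDate.keyword", "1939-02-22")))
```

With a distance limit of 1, the first two inputs would be within reach of the stored date and the third would not, which matches the behaviour asked for in the question. Whether the generated query returns the expected documents depends on the field really being indexed as a single unanalyzed term.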