How to write a curl fuzzy search query in Elasticsearch


I have created a record in ES and can see it:

curl -XGET http://localhost:9200/companies_test/_search -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  }
}'

returns

{
  "took": 38,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "companies_test",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "regionName": "North America",
          "name": "OpenAI",
          "id": "0"
        }
      }
    ]
  }
}

But when I try to find it:

curl -XGET http://localhost:9200/companies_test/_search -H 'Content-Type: application/json' -d '{
  "query": {
    "fuzzy": {
      "name": {
        "value": "OpenAI",
        "fuzziness": "AUTO"
      }
    }
  }
}'

it returns nothing:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

My search query is probably incorrect. How can I fix it? The goal is to fuzzy-search companies by name, both within a specific region and without a region filter.


Solution

  • The "fuzzy" query is a term-level query: it does not analyze the search value, but compares it as-is against the terms stored in the index. When you indexed "OpenAI", the text analyzer lowercased it to "openai", so the "fuzzy" query treats "OpenAI" -> "openai" as 3 edits. With "fuzziness": "AUTO", a value longer than 5 characters allows at most 2 edits, so the query returns zero matches.

    As proof, try searching for "openAI" (only 2 edits away from "openai"), and you will get results.

    Read more about the fuzziness AUTO rules here: https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#fuzziness
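
The edit counts above can be checked by hand. The following is a minimal sketch, not Elasticsearch's own implementation: the helper names levenshtein and auto_fuzziness are hypothetical, the distance is plain Levenshtein (Elasticsearch additionally counts a transposition of two adjacent characters as a single edit by default, which does not matter for these inputs), and the thresholds are the default AUTO rules (0 edits up to length 2, 1 edit for lengths 3 to 5, 2 edits above that).

```shell
# Sketch only: levenshtein and auto_fuzziness are hypothetical helper names.

# Plain, case-sensitive Levenshtein distance, computed row by row in awk.
levenshtein() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    la = length(a); lb = length(b)
    for (j = 0; j <= lb; j++) prev[j] = j
    for (i = 1; i <= la; i++) {
      curr[0] = i
      for (j = 1; j <= lb; j++) {
        cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
        best = prev[j] + 1                                     # deletion
        if (curr[j-1] + 1 < best) best = curr[j-1] + 1         # insertion
        if (prev[j-1] + cost < best) best = prev[j-1] + cost   # substitution
        curr[j] = best
      }
      for (j = 0; j <= lb; j++) prev[j] = curr[j]
    }
    print prev[lb]
  }'
}

# Maximum number of edits that "fuzziness": "AUTO" allows for a given value.
auto_fuzziness() {
  len=${#1}
  if [ "$len" -le 2 ]; then echo 0
  elif [ "$len" -le 5 ]; then echo 1
  else echo 2
  fi
}

levenshtein "OpenAI" "openai"   # prints 3
levenshtein "openAI" "openai"   # prints 2
auto_fuzziness "OpenAI"         # prints 2
```

Since 3 edits exceed the allowed 2, "OpenAI" misses the indexed term "openai", while "openAI" stays within the limit.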

    As a solution:

    I recommend using a "match" query with the fuzziness parameter. A "match" query runs the search text through the same analyzer that was applied when the data was indexed, so "OpenAI" becomes "openai" first, and fuzziness is then applied to the analyzed terms rather than to the raw input.

    GET /companies_test/_search
    {
      "query": {
        "match": {
          "name": {
            "query": "OpenAI",
            "fuzziness": "AUTO"
          }
        }
      }
    }
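
The question also asks about searching within a specific region. One possible way to do that, sketched here with curl, is to wrap the "match" query in a "bool" query and add a "term" filter. This is an illustration under assumptions rather than a tested setup: build_query and search_companies are hypothetical helper names, and "regionName.keyword" assumes the index uses Elasticsearch's default dynamic mapping, which adds a "keyword" sub-field to string fields. Check the actual mapping before relying on it.

```shell
# Sketch only: build_query and search_companies are hypothetical helper names.
# Assumption: "regionName" has a "keyword" sub-field (default dynamic mapping).
# Values are interpolated as-is, so they must not contain quotes or backslashes.

# Print the JSON body: fuzzy match on name, plus an exact region filter if given.
build_query() {
  name="$1"
  region="${2:-}"
  if [ -n "$region" ]; then
    printf '{"query":{"bool":{"must":{"match":{"name":{"query":"%s","fuzziness":"AUTO"}}},"filter":{"term":{"regionName.keyword":"%s"}}}}}\n' "$name" "$region"
  else
    printf '{"query":{"match":{"name":{"query":"%s","fuzziness":"AUTO"}}}}\n' "$name"
  fi
}

# Send the body to the index used in the question.
search_companies() {
  curl -s -XGET http://localhost:9200/companies_test/_search \
    -H 'Content-Type: application/json' \
    -d "$(build_query "$@")"
}
```

With these helpers, search_companies "OpenAI" "North America" would restrict the fuzzy search to one region, and search_companies "OpenAI" would search across all regions. The region goes into "filter" rather than "must" because it is an exact yes/no condition that should not influence the relevance score.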