elasticsearch, arabic

Elasticsearch: searching Arabic text with different character shapes


I have an Elasticsearch index that contains an Arabic text field. I need to be able to search for "أحمد" and get results for "أحمد", "احمد", and "آحمد".

How can I achieve that?


Solution

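  • If you are working with Arabic text, it makes sense to index your data with the built-in arabic analyzer. Among other steps, it applies Arabic normalization, which folds the alef variants أ, إ, and آ into a bare ا, so all three spellings are indexed as the same token and a match query for any one of them finds them all. Here is a quick example to get you started, but I would strongly suggest reading about and understanding how analysis works.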

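    # Start from a clean index (returns 404 on the first run, which is harmless)
    DELETE arabic_example

    # Create the index and analyze the "text" field with the built-in arabic analyzer
    PUT arabic_example
    {
      "mappings": {
        "properties": {
          "text": {
            "type": "text",
            "analyzer": "arabic"
          }
        }
      }
    }
    
    # Index the three spellings; ?refresh makes them searchable immediately
    POST arabic_example/_bulk?refresh
    {"index": {}}
    {"text":"احمد"}
    {"index": {}}
    {"text":"أحمد"}
    {"index": {}}
    {"text":"آحمد"}
    
    # The query text goes through the same analyzer, so all three documents match
    POST arabic_example/_search
    {
      "query": {
        "match": {
          "text": "أحمد"
        }
      }
    }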
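
  • To see which tokens the analyzer actually produces, you can run the same text through the _analyze API. This is a small sketch that assumes the arabic_example index and its text field from the example above; if the analyzer folds the alef variants, all three spellings should come back as the same token.

    POST arabic_example/_analyze
    {
      "field": "text",
      "text": "احمد أحمد آحمد"
    }

    Comparing the returned tokens for different inputs is a quick way to check why a given query does or does not match a document.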