elasticsearch logstash kibana elasticsearch-5 elasticsearch-dsl

How to match exact document data in elasticsearch using DSL query?

My tokenizer

 "tokenizer": {
        "my_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
 }

I am trying to search the values in these fields by prefix. If I search with the token s, I should get the items starting with s. If I search with sp, I want only the items starting with sp, and everything else should be discarded. That is not what I get. Is my query wrong, or is the problem in the tokenizer I have used? Can someone please help me with this?

 {
     "query": {
      "bool": {
       "must": [
        {
         "multi_match": {
          "query": "PRODUCT",
          "fields": [
           "item",
           "data1"
          ]
         }
        },
        {
         "multi_match": {
          "query": "SUB_FAMILY",
          "fields": [
           "item",
           "data1"
          ]
         }
        },
        {
         "match": {
          "values": "SP"
         }
        }
       ]
      }
     }
    }

The output for this query is

 "hits": [
                {
                    "_index": "logs_datas",
                    "_type": "_doc",
                    "_id": "H1PfEnkBQXpKNrJSp8bV",
                    "_score": 9.418445,
                    "_source": {
                        "message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
                        "path": "/home/elasticsearchDatas.csv",
                        "hierarchy_name": "PRODUCT",
                        "@version": "1",
                        "@timestamp": "2021-04-27T10:28:37.578Z",
                        "host": "ewiglp71",
                        "item_pk": "SPRINHO2H",
                        "attribute_name": "SUB_FAMILY"
                    }
                },
                {
                    "_index": "logs_datas",
                    "_type": "_doc",
                    "_id": "y1PfEnkBQXpKNrJSp8XQ",
                    "_score": 5.3059187,
                    "_source": {
                        "message": "PRODUCT,SUB_FAMILY,SCMLPLWVI",
                        "path": "/home/niteshb/elasticsearchDatas.csv",
                        "hierarchy_name": "PRODUCT",
                        "@version": "1",
                        "@timestamp": "2021-04-27T10:28:37.577Z",
                        "host": "ewiglp71",
                        "item_pk": "SCMLPLWVI",
                        "attribute_name": "SUB_FAMILY"
                    }
                },
                {
                    "_index": "logs_datas",
                    "_type": "_doc",
                    "_id": "zFPfEnkBQXpKNrJSp8XQ",
                    "_score": 5.3059187,
                    "_source": {
                        "message": "PRODUCT,SUB_FAMILY,SSVRKEN2Z",
                        "path": "/home/elasticsearchDatas.csv",
                        "hierarchy_name": "PRODUCT",
                        "@version": "1",
                        "@timestamp": "2021-04-27T10:28:37.579Z",
                        "host": "ewiglp71",
                        "item_pk": "SSVRKEN2Z",
                        "attribute_name": "SUB_FAMILY"
                    }
                }
            ]
        }
    }

Solution

Since min_gram is 1, the tokens generated for SCMLPLWVI will be:

{
  "tokens": [
    {
      "token": "S",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "SC",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "SCM",
      "start_offset": 0,
      "end_offset": 3,
      "type": "word",
      "position": 2
    },
    {
      "token": "SCML",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 3
    },
    {
      "token": "SCMLP",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 4
    },
    {
      "token": "SCMLPL",
      "start_offset": 0,
      "end_offset": 6,
      "type": "word",
      "position": 5
    },
    {
      "token": "SCMLPLW",
      "start_offset": 0,
      "end_offset": 7,
      "type": "word",
      "position": 6
    },
    {
      "token": "SCMLPLWV",
      "start_offset": 0,
      "end_offset": 8,
      "type": "word",
      "position": 7
    },
    {
      "token": "SCMLPLWVI",
      "start_offset": 0,
      "end_offset": 9,
      "type": "word",
      "position": 8
    }
  ]
}
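To see why a search for SP also returns SCMLPLWVI, it can help to reproduce the tokenization outside Elasticsearch. The sketch below is a simplified Python stand-in for the edge_ngram tokenizer on a single word. The helper name edge_ngrams is made up for illustration, and the sketch assumes the search text is analyzed with the same tokenizer as the indexed value, which is the default when no separate search_analyzer is configured.

```python
def edge_ngrams(text, min_gram=1, max_gram=10):
    """Simplified stand-in for the edge_ngram tokenizer on a single word."""
    upper = min(len(text), max_gram)
    return [text[:n] for n in range(min_gram, upper + 1)]


# Tokens stored in the index for the document value.
indexed = set(edge_ngrams("SCMLPLWVI"))

# Tokens produced from the search text (assumption: same analyzer at search time).
query = set(edge_ngrams("SP"))

# A match query with the default OR operator needs only one shared token.
print(sorted(query))            # ['S', 'SP']
print(sorted(query & indexed))  # ['S']
```

Under these assumptions, the shared token S is enough for the document to be returned, even though SCMLPLWVI does not start with SP.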

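With min_gram set to 1, the single-character token S is indexed for every item that starts with S. Assuming the same analyzer is applied at search time, the search text SP is also split into S and SP, so it matches all of those items. If you want to get only the values starting with sp, you need to modify your tokenizer as follows: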

 "tokenizer": {
        "my_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,          // note this
          "max_gram": 10,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
 }
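The effect of raising min_gram can be checked with a small simulation. This is only a sketch: edge_ngrams and matching_items are hypothetical helpers, not Elasticsearch APIs, the three item values are taken from the question, and the model assumes the search text goes through the same tokenizer and that one shared token is enough to match.

```python
def edge_ngrams(text, min_gram, max_gram=10):
    """Simplified stand-in for the edge_ngram tokenizer on a single word."""
    upper = min(len(text), max_gram)
    return [text[:n] for n in range(min_gram, upper + 1)]


items = ["SPRINHO2H", "SCMLPLWVI", "SSVRKEN2Z"]


def matching_items(query, min_gram):
    """Items that share at least one token with the query (OR semantics)."""
    query_tokens = set(edge_ngrams(query, min_gram))
    return [
        item for item in items
        if query_tokens & set(edge_ngrams(item, min_gram))
    ]


print(matching_items("SP", min_gram=1))  # all three items, via the shared token "S"
print(matching_items("SP", min_gram=2))  # ['SPRINHO2H']
```

In this model the two-character search behaves as wanted once min_gram is 2. A longer search text would still be split into several prefixes, so it is worth checking such cases against the real index.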

Update 1:

You can use a match_bool_prefix query (available since Elasticsearch 7.2) to search for words starting with s or sp.

Here is a working example.

Index Mapping:

{
  "mappings": {
    "properties": {
      "item_pk": {
        "type": "text"
      }
    }
  }
}

Search Query 1:

{
  "query": {
    "match_bool_prefix" : {
      "item_pk" : "s"
    }
  }
}

Search Result:

"hits": [
      {
        "_index": "67281810",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
          "path": "/home/niteshb/elasticsearchDatas.csv",
          "hierarchy_name": "PRODUCT",
          "@version": "1",
          "@timestamp": "2021-04-27T10:28:37.578Z",
          "host": "ewiglp71",
          "item_pk": "SPRINHO2H",
          "attribute_name": "SUB_FAMILY"
        }
      },
      {
        "_index": "67281810",
        "_type": "_doc",
        "_id": "i7quE3kB6jKCA-nFYii6",
        "_score": 1.0,
        "_source": {
          "message": "PRODUCT,SUB_FAMILY,SCMLPLWVI",
          "path": "/home/niteshb/elasticsearchDatas.csv",
          "hierarchy_name": "PRODUCT",
          "@version": "1",
          "@timestamp": "2021-04-27T10:28:37.577Z",
          "host": "ewiglp71",
          "item_pk": "SCMLPLWVI",
          "attribute_name": "SUB_FAMILY"
        }
      },
      {
        "_index": "67281810",
        "_type": "_doc",
        "_id": "jLquE3kB6jKCA-nFgiju",
        "_score": 1.0,
        "_source": {
          "message": "PRODUCT,SUB_FAMILY,SSVRKEN2Z",
          "path": "/home/niteshb/elasticsearchDatas.csv",
          "hierarchy_name": "PRODUCT",
          "@version": "1",
          "@timestamp": "2021-04-27T10:28:37.579Z",
          "host": "ewiglp71",
          "item_pk": "SSVRKEN2Z",
          "attribute_name": "SUB_FAMILY"
        }
      }
    ]

Search Query 2:

{
  "query": {
    "match_bool_prefix" : {
      "item_pk" : "sp"
    }
  }
}

Search Result:

"hits": [
      {
        "_index": "67281810",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
          "path": "/home/niteshb/elasticsearchDatas.csv",
          "hierarchy_name": "PRODUCT",
          "@version": "1",
          "@timestamp": "2021-04-27T10:28:37.578Z",
          "host": "ewiglp71",
          "item_pk": "SPRINHO2H",
          "attribute_name": "SUB_FAMILY"
        }
      }
    ]
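According to the Elasticsearch documentation, match_bool_prefix analyzes its input and builds a bool query in which every term except the last becomes a term query and the last term becomes a prefix query. The sketch below mimics that expansion in Python. The function name expand_match_bool_prefix is made up for illustration, and lowercasing plus whitespace splitting is only a rough stand-in for the standard analyzer.

```python
import json


def expand_match_bool_prefix(field, text):
    """Build the bool query that match_bool_prefix is documented to be equivalent to.

    Assumes non-empty input; lower() + split() approximates the standard analyzer.
    """
    terms = text.lower().split()
    *leading, last = terms
    # Every term except the last is matched exactly.
    clauses = [{"term": {field: term}} for term in leading]
    # The last term is matched as a prefix.
    clauses.append({"prefix": {field: last}})
    return {"bool": {"should": clauses}}


print(json.dumps(expand_match_bool_prefix("item_pk", "sp"), indent=2))
```

With a plain text field, SPRINHO2H is expected to be indexed as the single lowercase term sprinho2h, which is why the prefix sp matches it while the terms for SCMLPLWVI and SSVRKEN2Z do not.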

Update 2:

Try this query, which combines matches on hierarchy_name and attribute_name with the prefix match on item_pk:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "hierarchy_name": "PRODUCT"
          }
        },
        {
          "match": {
            "attribute_name": "SUB_FAMILY"
          }
        },
        {
          "match_bool_prefix": {
            "item_pk": "sp"
          }
        }
      ]
    }
  }
}