elasticsearch, elasticsearch-7

How can I find documents whose field contains only the search term in Elasticsearch?


Part of my document mapping is below:

"character_cut": {
  "type": "keyword"
}

And here is some sample data:

doc1 character_cut: ["John"]

doc2 character_cut: ["John", "Smith"]

doc3 character_cut: ["Smith", "Jessica", "Anna"]

doc4 character_cut: ["John"]

If I search for "John", it retrieves doc1, doc2, and doc4.

How can I retrieve only doc1 and doc4 with a "John" query?


Solution

  • There are two ways to do it.

    1. Using token_count

    A field of type token_count is really an integer field which accepts string values, analyzes them, then indexes the number of tokens in the string. In the mapping below, the length sub-field uses the keyword analyzer, so it indexes the number of keyword tokens.

    PUT index-name
    {
      "mappings": {
        "properties": {
          "character_cut": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              },
              "length": {
                "type": "token_count",
                "analyzer": "keyword"
              }
            }
          }
        }
      }
    }
    

    Query (replace the value of character_cut.length with the number of matches required):

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "character_cut.keyword": {
                  "value": "John"
                }
              }
            },
            {
              "term": {
                "character_cut.length": {
                  "value": 1
                }
              }
            }
          ]
        }
      }
    }
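
The same request body can be built programmatically. Below is a minimal Python sketch, not part of the original answer: the function name build_single_value_query and its parameters are hypothetical, and it only constructs the body as a dict without contacting a cluster. It assumes the character_cut.keyword and character_cut.length sub-fields from the mapping above.

```python
import json


def build_single_value_query(term, count=1, field="character_cut"):
    """Build the bool query shown above as a plain dict.

    Hypothetical helper: `field` is assumed to have `.keyword` and
    `.length` (token_count) sub-fields, as in the mapping above.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    # exact match on the keyword sub-field
                    {"term": {f"{field}.keyword": {"value": term}}},
                    # match on the count stored in the token_count sub-field
                    {"term": {f"{field}.length": {"value": count}}},
                ]
            }
        }
    }


body = build_single_value_query("John")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a _search request; passing a different count changes the required number.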
    

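    2. Using a script query

    Replace the 1 in the script with the number of matches required. The query uses the character_cut.keyword sub-field from the mapping above; with the mapping from the question, where character_cut itself is a keyword field, reference character_cut instead.

    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "character_cut.keyword": {
                  "value": "John"
                }
              }
            },
            {
              "script": {
                "script": "doc['character_cut.keyword'].size() == 1"
              }
            }
          ]
        }
      }
    }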
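
To make the intent of the two conditions concrete, here is a small pure-Python sketch that applies the same logic to the sample data from the question. It is an illustration only and does not involve Elasticsearch; the names DOCS and only_term are made up for this example. A set is used on the assumption that size() on a keyword field counts distinct values, since keyword doc values are deduplicated.

```python
# Sample data from the question
DOCS = {
    "doc1": ["John"],
    "doc2": ["John", "Smith"],
    "doc3": ["Smith", "Jessica", "Anna"],
    "doc4": ["John"],
}


def only_term(docs, term, count=1):
    """Return ids of docs that contain `term` and hold exactly `count` distinct values."""
    hits = []
    for doc_id, values in docs.items():
        distinct = set(values)  # assumption: mirrors deduplicated doc values
        if term in distinct and len(distinct) == count:
            hits.append(doc_id)
    return hits


print(only_term(DOCS, "John"))  # ['doc1', 'doc4']
```

The first check corresponds to the term clause and the second to the size() comparison; doc2 is dropped because it holds two values.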
    

    token_count calculates the count at index time, so it will be faster than the script query, which computes the count at query time.