Search code examples
elasticsearchanalyzerhyphenation

Elastic, Term (or _ID) query with Hyphen in value


I am struggling to query for exact match. This field is identical in two fields in the document, within _id and within one field in the body.

So I can search either of these fields. Is there any way to configure the term query to support this? I've tried specifying whitespace analyzer but it doesn't seem to be a supported configuration for term queries.

Ive tried a few variations, but none of it has worked so far..

data: {
    query: {
        term: {
            "_id":"4123-0000"            
        }
    }
}

This doesn't return anything.


Solution

  • Issue is that as you are using default mapping, your _id field seems to be populated by you, which would have used text field which uses the standard analyzer and splits the tokens based on -, so your _id field is tokenized as below:

    POST /_analyze

    {
      "text" : "4123-0000",
      "analyzer" : "standard"
    }
    

    And tokens

     {
            "tokens": [
                {
                    "token": "4123",
                    "start_offset": 0,
                    "end_offset": 4,
                    "type": "<NUM>",
                    "position": 0
                },
                {
                    "token": "0000",
                    "start_offset": 5,
                    "end_offset": 9,
                    "type": "<NUM>",
                    "position": 1
                }
            ]
        }
    

    Now as you might be aware of that term query is not analyzed ie it uses the 4123-0000 as it is and tried to find in the inverted index, which is not available hence you don't get any result.

    Solution, simply replace _id to _id.keyword to get the search result.