I am looking at tokenizer in Elasticsearch 6.8. I know that it defines how text is split into terms when an index is built. For example, it would convert the text "Quick brown fox!" into the terms [Quick, brown, fox!].
If I have a field in Elasticsearch which has the text "Quick brown fox!", it will be broken into three terms in the index.
But what if I send the query text "Quick brown fox!"? Does the tokenizer apply to that query parameter as well?
Analyzers work at both index time and query time, provided they are correctly configured in the field mappings of your index.
On this page, you get a complete description of when an analyzer kicks in, repeated below for clarity:
At index time, Elasticsearch will look for an analyzer in this order:
- The analyzer defined in the field mapping.
- An analyzer named default in the index settings.
- The standard analyzer.
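
The index-time lookup order above can be modelled as a small fallback chain. This is only an illustrative sketch in plain Python, not Elasticsearch code: the function name `resolve_index_analyzer` and its two dictionary arguments are hypothetical stand-ins for a field mapping and the analyzers declared in the index settings.

```python
def resolve_index_analyzer(field_mapping, index_analyzers):
    """Return the name of the analyzer used at index time (illustrative model only)."""
    # 1. The analyzer defined in the field mapping.
    if "analyzer" in field_mapping:
        return field_mapping["analyzer"]
    # 2. An analyzer named "default" in the index settings.
    if "default" in index_analyzers:
        return "default"
    # 3. Otherwise fall back to the built-in standard analyzer.
    return "standard"


# Hypothetical inputs, one per rule in the list above.
print(resolve_index_analyzer({"type": "text", "analyzer": "whitespace"}, {}))
print(resolve_index_analyzer({"type": "text"}, {"default": {"type": "simple"}}))
print(resolve_index_analyzer({"type": "text"}, {}))
```

Each call exercises one rule, from most specific to least specific.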
At query time, there are a few more layers:
- The analyzer defined in a full-text query.
- The search_analyzer defined in the field mapping.
- The analyzer defined in the field mapping.
- An analyzer named default_search in the index settings.
- An analyzer named default in the index settings.
- The standard analyzer.
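
The query-time order can be sketched the same way: walk the candidates from most specific to least specific and take the first one that is set. Again, this is a plain-Python model under stated assumptions, not the Elasticsearch implementation; `resolve_search_analyzer` and its arguments are hypothetical names.

```python
def resolve_search_analyzer(query_analyzer, field_mapping, index_analyzers):
    """Return the name of the analyzer used at query time (illustrative model only)."""
    candidates = [
        # 1. The analyzer given in the full-text query itself.
        query_analyzer,
        # 2. The search_analyzer defined in the field mapping.
        field_mapping.get("search_analyzer"),
        # 3. The analyzer defined in the field mapping.
        field_mapping.get("analyzer"),
        # 4. An analyzer named "default_search" in the index settings.
        "default_search" if "default_search" in index_analyzers else None,
        # 5. An analyzer named "default" in the index settings.
        "default" if "default" in index_analyzers else None,
        # 6. The built-in standard analyzer, which is always available.
        "standard",
    ]
    # The first candidate that is actually set wins.
    return next(name for name in candidates if name is not None)


# Hypothetical field mapping with only an index-time analyzer configured.
mapping = {"type": "text", "analyzer": "whitespace"}
print(resolve_search_analyzer(None, mapping, {}))
print(resolve_search_analyzer("simple", mapping, {}))
```

With no query-level analyzer and no search_analyzer, the field's own analyzer is reused, which is why the same query text is normally tokenized the same way as the indexed text.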
So as you can see, an analyzer can be leveraged both when you ingest data and also when you query it.
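
As a concrete illustration, here is a sketch of the request bodies you might send, written as Python dictionaries and printed as JSON. The field name `title` is made up for the example, and the built-in whitespace analyzer is assumed because it yields the terms [Quick, brown, fox!] from the question. In 6.8, mappings are nested under a type name, here `_doc`.

```python
import json

# Body for creating the index: the field gets an index-time analyzer.
# "search_analyzer" is optional; when omitted, "analyzer" is used at query time too.
index_body = {
    "mappings": {
        "_doc": {
            "properties": {
                "title": {  # hypothetical field name
                    "type": "text",
                    "analyzer": "whitespace",
                    "search_analyzer": "whitespace",
                }
            }
        }
    }
}

# Body for a match query: the optional "analyzer" parameter overrides
# whatever the mapping would otherwise select for the query text.
search_body = {
    "query": {
        "match": {
            "title": {
                "query": "Quick brown fox!",
                "analyzer": "whitespace",
            }
        }
    }
}

print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

Setting the same analyzer in all three places is redundant and done here only to show where each setting lives; in practice the field-level analyzer alone is usually enough.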