elasticsearch tokenize analyzer elasticsearch-analyzers

ES analyzer that tokenizes numbers and digits as well


I am using Elasticsearch's built-in Simple analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/1.7/analysis-simple-analyzer.html), which uses the Lower Case Tokenizer. The text "apple 8 IS Awesome" is tokenized as follows:

 "apple",
 "is",
 "awesome"

As you can see, the number 8 is not emitted as a token, so if I search for just 8, my message will not appear in the results.
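The dropped token follows from how the Lower Case Tokenizer splits text: it breaks on every character that is not a letter and lowercases what remains, so a standalone digit never becomes a token. The sketch below is only a rough local approximation of that behavior in Python, not Elasticsearch's implementation; the function name simple_analyzer_approx is hypothetical.

```python
import re
from typing import List


def simple_analyzer_approx(text: str) -> List[str]:
    """Roughly mimic the simple analyzer: split on non-letters, lowercase."""
    # [^\W\d_]+ matches runs of letters only: word characters
    # with digits and the underscore excluded.
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


tokens = simple_analyzer_approx("apple 8 IS Awesome")
print(tokens)  # ['apple', 'is', 'awesome']
```

Because "8" contains no letters, it is treated purely as a separator, which matches the output shown above.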

I went through all the analyzers available in ES but couldn't find one that matches my requirement.

How can I tokenize all the words, including numbers, using a custom or built-in ES analyzer?


Solution

  • Your question is about the simple analyzer, but you link to very old documentation. Try https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-simple-analyzer.html

    As Val told you, you are probably looking for the standard analyzer. If you want to see the difference, try the analysis API:
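A minimal sketch of such a comparison, written in Python with only the standard library. It assumes a cluster reachable at http://localhost:9200 without authentication. The helper names build_analyze_body and analyze are hypothetical. The request shape (POST to /_analyze with "analyzer" and "text" fields) follows the Analyze API.

```python
import json
import urllib.request
from typing import Dict, List

# Assumption: a local, unsecured cluster. Adjust for your setup.
ES_URL = "http://localhost:9200"


def build_analyze_body(analyzer: str, text: str) -> Dict[str, str]:
    """Build the JSON body for a POST /_analyze request."""
    return {"analyzer": analyzer, "text": text}


def analyze(analyzer: str, text: str, base_url: str = ES_URL) -> List[str]:
    """Send the text through the named analyzer and return its tokens."""
    body = json.dumps(build_analyze_body(analyzer, text)).encode("utf-8")
    request = urllib.request.Request(
        base_url + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Each entry in "tokens" carries the token text plus offsets and type.
    return [entry["token"] for entry in payload["tokens"]]


# Example calls (these need a running cluster, so they are left commented out):
# print(analyze("simple", "apple 8 IS Awesome"))
# print(analyze("standard", "apple 8 IS Awesome"))
```

Against a running cluster, the two calls should differ only in whether 8 appears among the tokens. The standard analyzer keeps numeric tokens, while the simple analyzer discards them.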