
Elasticsearch custom tokenizer: don't split a time on ":"


For example, I have a log line like this:

11:22:33 user:abc&game:cde

If I use the standard tokenizer, this log line is split into:

 11  22   33  user  abc  game  cde

But 11:22:33 is a time, and I don't want it split. I want a custom tokenizer that splits the line into:

11:22:33  user abc  game  cde

So how should I configure the tokenizer?


Solution

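  • You can use the pattern tokenizer to achieve this. It splits the text wherever a regular expression matches, so the pattern can treat whitespace, "&", and ":" as separators while leaving alone any ":" that sits between two digits.

    From the documentation: "A tokenizer of type pattern that can flexibly separate text into terms via a regular expression."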

    Read more here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html
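
    As a rough illustration, here is a minimal Python sketch of one pattern that might work for the log line in the question. The names `time_aware_tokenizer`, `log_analyzer`, `SPLIT_PATTERN`, and `tokenize` are made up for this example, and the assumption that a ":" between two digits always belongs to a time is mine, not something the question guarantees. The pattern tokenizer uses Java regular expressions; the constructs used here (character class, lookbehind, lookahead) behave the same way in Python's `re`, which is what makes the local check possible.

```python
import json
import re

# Split on runs of whitespace or "&", or on a ":" that is NOT between two digits.
# Assumption: a ":" with a digit on both sides is part of a time such as 11:22:33.
SPLIT_PATTERN = r"[\s&]+|(?<!\d):|:(?!\d)"

# Index settings that could be sent when creating an index.
# "time_aware_tokenizer" and "log_analyzer" are hypothetical names.
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "time_aware_tokenizer": {
                    "type": "pattern",
                    "pattern": SPLIT_PATTERN,
                }
            },
            "analyzer": {
                "log_analyzer": {
                    "type": "custom",
                    "tokenizer": "time_aware_tokenizer",
                }
            },
        }
    }
}


def tokenize(text: str) -> list[str]:
    """Approximate the tokenizer locally by splitting on the same pattern."""
    return [token for token in re.split(SPLIT_PATTERN, text) if token]


if __name__ == "__main__":
    # JSON body for the index-creation request.
    print(json.dumps(index_settings, indent=2))
    # Local check of the pattern against the sample log line.
    print(tokenize("11:22:33 user:abc&game:cde"))
```

    The local `tokenize` helper only approximates what Elasticsearch does, so it is worth confirming the result with the `_analyze` API on a real index. Note that under this pattern a value such as `port:8080` still splits (the ":" follows a letter), but `id1:2` would not, which may or may not suit your logs.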