I need to tokenize the string 36-3031.00|36-3021.00 into the two tokens 36-3031.00 and 36-3021.00, splitting on the | character.
This is what I have tried:
PUT text
{
  "test1": {
    "settings": {
      "analysis": {
        "tokenizer": {
          "pipe_tokenizer": {
            "type": "pattern",
            "pattern": "|"
          }
        },
        "analyzer": {
          "pipe_analyzer": {
            "type": "custom",
            "tokenizer": "pipe_tokenizer"
          }
        }
      }
    },
    "mappings": {
      "mytype": {
        "properties": {
          "text": {
            "type": "string",
            "analyzer": "pipe_analyzer"
          }
        }
      }
    }
  }
}
But it doesn't produce the expected tokens. Can anyone help me sort out this use case?
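The following is the correct mapping you should use. Two things change: the index name goes in the REST PUT command (PUT test1) instead of in the request body, and the | character needs to be escaped, because the pattern is a regular expression and an unescaped | means alternation: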
DELETE test1

PUT test1
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "pipe_tokenizer": {
          "type": "pattern",
          "pattern": "\\|"
        }
      },
      "analyzer": {
        "pipe_analyzer": {
          "type": "custom",
          "tokenizer": "pipe_tokenizer"
        }
      }
    }
  },
  "mappings": {
    "mytype": {
      "properties": {
        "text": {
          "type": "string",
          "analyzer": "pipe_analyzer"
        }
      }
    }
  }
}
POST /test1/mytype/1
GET /test1/_analyze
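
To see why the escape matters, here is a small sketch that splits the same string with an unescaped and an escaped pipe. It uses Python's re module as a stand-in for the Java regular expressions that Elasticsearch uses, which is an assumption made only for illustration; the tokenize helper is hypothetical and is not part of Elasticsearch. It requires Python 3.7 or later, where re.split also splits on empty matches.

```python
import re

text = "36-3031.00|36-3021.00"


def tokenize(pattern, value):
    """Hypothetical helper: split on a regex separator and drop empty tokens,
    roughly mimicking a pattern-based tokenizer."""
    return [tok for tok in re.split(pattern, value) if tok]


# Unescaped "|" is an alternation of two empty patterns, so it matches the
# empty string at every position and the text is cut between all characters.
unescaped = tokenize("|", text)

# Escaped "\|" matches a literal pipe, so the text is cut only at the pipe.
escaped = tokenize(r"\|", text)

print(unescaped)
print(escaped)
```

In the JSON mapping the backslash itself has to be escaped, which is why the pattern is written as "\\|" there while the Python raw string above needs only a single backslash.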