My partial-match mappings and queries work great until a space is involved. For example, the term Jon Doe breaks down into the following term vector:
"terms": {
  "j": {
    "term_freq": 1
  },
  "jo": {
    "term_freq": 1
  },
  "jon": {
    "term_freq": 1
  },
  "d": {
    "term_freq": 1
  },
  "do": {
    "term_freq": 1
  },
  "doe": {
    "term_freq": 1
  }
}
But I would like it to be:
"terms": {
  "j": {
    "term_freq": 1
  },
  "jo": {
    "term_freq": 1
  },
  "jon": {
    "term_freq": 1
  },
  "jon ": {
    "term_freq": 1
  },
  "jon d": {
    "term_freq": 1
  },
  "jon do": {
    "term_freq": 1
  },
  "jon doe": {
    "term_freq": 1
  }
}
Here are my mappings and settings:
Mappings:
name: {
  type: 'string',
  term_vector: 'yes',
  analyzer: 'ngram_analyzer',
  search_analyzer: 'standard',
  include_in_all: true
}
Settings:
settings: {
  index: {
    analysis: {
      filter: {
        ngram_filter: {
          type: 'edge_ngram',
          min_gram: 1,
          max_gram: 15
        }
      },
      analyzer: {
        'ngram_analyzer': {
          filter: [
            'lowercase',
            'ngram_filter'
          ],
          type: 'custom',
          tokenizer: 'standard'
        }
      }
    },
    number_of_shards: 1,
    number_of_replicas: 1
  }
}
};
How would I go about this?
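You just need to use a different tokenizer in your custom analyzer. The standard tokenizer splits Jon Doe into two tokens before the edge_ngram filter runs, so no n-gram can ever span the space. The keyword tokenizer emits the whole field value as a single token, so the filter produces prefixes that include the space:
"analyzer": {
  "ngram_analyzer": {
    "filter": [
      "lowercase",
      "ngram_filter"
    ],
    "type": "custom",
    "tokenizer": "keyword"
  }
}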
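To see why the tokenizer makes the difference, here is a rough sketch in plain JavaScript that imitates the two analysis chains. It is not Elasticsearch code: the function names are made up for illustration, and the standard tokenizer is approximated by a simple whitespace split, which is enough for an input like Jon Doe but does not reproduce its full behavior.

```javascript
// Imitates an edge_ngram token filter: emit every prefix of `token`
// whose length is between minGram and maxGram (inclusive).
function edgeNgrams(token, minGram, maxGram) {
  const grams = [];
  const limit = Math.min(maxGram, token.length);
  for (let len = minGram; len <= limit; len++) {
    grams.push(token.slice(0, len));
  }
  return grams;
}

// Assumption: approximates the standard tokenizer with a whitespace split.
// Each word becomes its own token, so prefixes restart at every word.
function analyzeWithWordTokenizer(text, minGram, maxGram) {
  return text
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .flatMap((token) => edgeNgrams(token, minGram, maxGram));
}

// Imitates the keyword tokenizer: the whole input is a single token,
// so prefixes can run across the space.
function analyzeWithKeywordTokenizer(text, minGram, maxGram) {
  return edgeNgrams(text.toLowerCase(), minGram, maxGram);
}

console.log(analyzeWithWordTokenizer('Jon Doe', 1, 15));
// [ 'j', 'jo', 'jon', 'd', 'do', 'doe' ]

console.log(analyzeWithKeywordTokenizer('Jon Doe', 1, 15));
// [ 'j', 'jo', 'jon', 'jon ', 'jon d', 'jon do', 'jon doe' ]
```

One consequence visible in the sketch: with a single token per field value, max_gram: 15 caps the longest indexed prefix at 15 characters of the whole value, so you may want to raise it if your names can be longer than that.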