I'm using django_elasticsearch_dsl.
My Document:
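import django_elasticsearch_dsl
from django_elasticsearch_dsl import fields
from elasticsearch_dsl import analyzer

# Custom analyzer: strip HTML, lowercase, drop stop words, then stem.
html_strip = analyzer(
    'html_strip',
    tokenizer='standard',
    filter=["lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)

class Document(django_elasticsearch_dsl.Document):
    name = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.KeywordField(),
            'suggest': fields.CompletionField(),
        }
    )
    ...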
My request:
_search = Document.search().suggest("suggestions", text=query, completion={'field': 'name.suggest'}).execute()
I have the following document "names" indexed:
"This is a test"
"this is my test"
"this test"
"Test this"
Now if I search for "This is my test", I receive only
"this is my test"
However, if I search for "test", all I get is
"Test this"
even though I want all documents that have "test" in their name.
What am I missing?
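The completion suggester only matches prefixes of the indexed input, which is why searching for "test" returns only "Test this". Based on your comment, here is another answer that uses n-grams with a regular match query instead.
Below is a working example with the index mapping, index data, search query, and search result.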
Index Mapping:
{
  "settings": {
    "analysis": {
      "filter": {
        "ngram_filter": {
          "type": "ngram",
          "min_gram": 4,
          "max_gram": 20
        }
      },
      "analyzer": {
        "ngram_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "ngram_filter"
          ]
        }
      }
    },
    "max_ngram_diff": 50
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ngram_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
Index Data:
{
  "name": [
    "Test this"
  ]
}
{
  "name": [
    "This is a test"
  ]
}
{
  "name": [
    "this is my test"
  ]
}
{
  "name": [
    "this test"
  ]
}
Analyze API:
POST /stof_64281341/_analyze
{
  "analyzer": "ngram_analyzer",
  "text": "this is my test"
}
The following tokens are generated:
{
  "tokens": [
    {
      "token": "this",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "test",
      "start_offset": 11,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
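To see why only "this" and "test" come out of the analyzer, here is a rough Python sketch of what ngram_analyzer does. It is a simplified model, not Elasticsearch's actual implementation: the helper name `ngram_tokens` is hypothetical, and a regex word split stands in for the standard tokenizer. Words shorter than min_gram, such as "is" and "my", produce no tokens at all.

```python
import re


def ngram_tokens(text, min_gram=4, max_gram=20):
    """Roughly mimic ngram_analyzer: word split, lowercase, then n-grams."""
    tokens = []
    # Assumption: a regex word split stands in for the standard tokenizer.
    for word in re.findall(r"\w+", text.lower()):
        for start in range(len(word)):
            for size in range(min_gram, max_gram + 1):
                if start + size > len(word):
                    break  # no longer substring fits at this offset
                tokens.append(word[start:start + size])
    return tokens


print(ngram_tokens("this is my test"))  # ['this', 'test']
print(ngram_tokens("testing"))          # includes 'test', 'testi', 'esti', ...
```

Because min_gram is 4, a query shorter than four characters (for example "tes") would find no matching token in this index. Lowering min_gram allows shorter partial matches at the cost of a larger index.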
Search Query:
{
  "query": {
    "match": {
      "name": "test"
    }
  }
}
Search Result:
"hits": [
  {
    "_index": "stof_64281341",
    "_type": "_doc",
    "_id": "4",
    "_score": 0.2876821,
    "_source": {
      "name": [
        "Test this"
      ]
    }
  },
  {
    "_index": "stof_64281341",
    "_type": "_doc",
    "_id": "3",
    "_score": 0.2876821,
    "_source": {
      "name": [
        "this is my test"
      ]
    }
  },
  {
    "_index": "stof_64281341",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.2876821,
    "_source": {
      "name": [
        "This is a test"
      ]
    }
  },
  {
    "_index": "stof_64281341",
    "_type": "_doc",
    "_id": "1",
    "_score": 0.2876821,
    "_source": {
      "name": [
        "this test"
      ]
    }
  }
]
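The difference between this result and the one from the completion suggester in the question can be illustrated with a small Python sketch. It is only a model of the two behaviours, not how Elasticsearch implements them: `prefix_matches` and `token_matches` are hypothetical helpers, and whitespace splitting stands in for real analysis.

```python
NAMES = ["This is a test", "this is my test", "this test", "Test this"]


def prefix_matches(query, names=NAMES):
    """Model of a completion suggester: the query must be a prefix of the name."""
    q = query.lower()
    return [n for n in names if n.lower().startswith(q)]


def token_matches(query, names=NAMES):
    """Model of a match query: some query word equals some word in the name."""
    words = set(query.lower().split())
    return [n for n in names if words & set(n.lower().split())]


print(prefix_matches("test"))             # ['Test this']
print(prefix_matches("This is my test"))  # ['this is my test']
print(token_matches("test"))              # all four names
```

The prefix model reproduces the results described in the question, while the token model returns every name containing "test", as the match query above does.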
For fuzzy search you can use the search query below (here "tst" is used in place of "test"):
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "tst"
      }
    }
  }
}
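As a rough illustration of why "tst" can still find "test", the sketch below computes a plain Levenshtein distance and the number of edits permitted by the default AUTO fuzziness (no edits for terms under 3 characters, one edit for 3 to 5, two edits for longer terms). The helper names are hypothetical, and the sketch ignores transpositions, which the fuzzy query counts as a single edit by default.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def auto_fuzziness(term):
    """Edits allowed by the default AUTO fuzziness (thresholds 3 and 6)."""
    if len(term) < 3:
        return 0
    return 1 if len(term) < 6 else 2


query = "tst"
print(edit_distance(query, "test"))  # 1
print(auto_fuzziness(query))         # 1
```

Since the distance from "tst" to the indexed token "test" is within the allowed number of edits, documents containing that token are returned.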