I'm new to Elasticsearch. I would like to write a query that matches a document with the following attribute:
description: 20*40x555
I want the query to match the following inputs:
20X40X555
20*40*555
20x40x555
I tried the wildcard query and also the regexp query. If the document's attribute does not contain any asterisk characters, the queries work fine, but when the description attribute contains an * character, the document is not found.
I tried the following queries:
With wildcard:
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "short_description": "*20?40?555*"
          }
        }
      ]
    }
  }
}
With regexp:
{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "short_description": ".*20.*40.*555.*"
          }
        }
      ]
    }
  }
}
When the short_description attribute is, for example, 20x40x555, both of these work, but if I change the value to 20*40x555, the document is not returned.
How can I get results when the value of short_description is e.g. 20*40x555? Thanks!
Assuming that you are using the standard analyzer, the "*" is no longer present in the index. The text 20*40x555 is indexed as two tokens, 20 and 40x555. Because wildcard and regexp are term-level queries that are matched against individual tokens, neither of your patterns can match across the two tokens. The simplest way to make * interchangeable with x is to replace it with x at index time by using a pattern_replace character filter. Here is a simple example that illustrates this idea:
DELETE test
PUT test
{
  "settings": {
    "analysis": {
      "char_filter": {
        "asterisk_to_x": {
          "type": "pattern_replace",
          "pattern": "(\\S)\\*(\\S)",
          "replacement": "$1x$2"
        }
      },
      "analyzer": {
        "custom_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["asterisk_to_x"],
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "short_description": {
        "type": "text",
        "analyzer": "custom_analyzer"
      }
    }
  }
}
POST test/_bulk?refresh
{ "index": { "_id": "1" } }
{"short_description":"20x40*555"}
POST test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "short_description": "20X40X555"
          }
        }
      ]
    }
  }
}
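To see why the match query finds the document, the analysis chain can be approximated outside Elasticsearch. The following is a minimal Python sketch, not the Elasticsearch implementation: the function name `analyze` is hypothetical, and it assumes that Python's `re.sub` behaves like the Java regex used by pattern_replace for this simple pattern. Since no separate search analyzer is configured in the mapping, the same analyzer is applied to the match query text, so the indexed value and the query input are normalized in the same way.

```python
import re


def analyze(text: str) -> str:
    """Rough stand-in for custom_analyzer (hypothetical helper, not an ES API)."""
    # pattern_replace char filter: an asterisk sitting between two
    # non-whitespace characters is replaced by "x".
    replaced = re.sub(r"(\S)\*(\S)", r"\1x\2", text)
    # The keyword tokenizer keeps the whole string as a single token,
    # and the lowercase filter folds "X" to "x".
    return replaced.lower()


indexed = analyze("20x40*555")  # the document value from the bulk request

for query in ["20X40X555", "20*40*555", "20x40x555"]:
    token = analyze(query)
    print(f"{query} -> {token} (matches indexed token: {token == indexed})")
```

In this sketch all three inputs from the question reduce to the single token 20x40x555, which equals the token produced for the indexed value. One caveat it also makes visible: regex matches cannot overlap, so an input with a single-character segment such as 2*4*5 comes out as 2x4*5 under this pattern.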