elasticsearch

regex does not work when used in completion-suggester


I want to look up all suggestions that start with "iphone 5", ignoring case. I don't understand why the expression /iphone 5.*/gi doesn't match any data. I've verified on regex101 that it works.

doc mappings:

{
    "properties": {
        "_class": {
            "type": "keyword",
            "index": false,
            "doc_values": false
        },
        "id": {
            "type": "keyword"
        },
        "suggest": {
            "type": "completion",
            "analyzer": "simple",
            "preserve_separators": true,
            "preserve_position_increments": true,
            "max_input_length": 50,
            "contexts": [
                {
                    "name": "context",
                    "type": "CATEGORY"
                }
            ]
        }
    }
}

data example:

{
    "_index": "product_name_suggest",
    "_id": "c349997fc3774baf902e2def411c8616ar",
    "_score": 1.0,
    "_source": {
        "_class": "com.baoyihui.screw.model.elasticsearch.entity.ProductNameSuggest",
        "id": "c349997fc3774baf902e2def411c8616ar",
        "suggest": {
            "input": [
                "Samsung Galaxy Note Galaxy Note 3 Neo SM-N7505L",
                "Galaxy Note Galaxy Note 3 Neo SM-N7505L",
                "Note Galaxy Note 3 Neo SM-N7505L",
                "Galaxy Note 3 Neo SM-N7505L",
                "Note 3 Neo SM-N7505L",
                "3 Neo SM-N7505L",
                "Neo SM-N7505L",
                "SM-N7505L"
            ],
            "contexts": {
                "context": [
                    "ar_machine_CompleteMachineDoc"
                ]
            }
        }
    }
}

request body sent to http://127.0.0.1:9200/product_name_suggest/_search:

{
    "suggest": {
        "suggest": {
            "regex": "/iphone 5.*/gi",
            "completion": {
                "field": "suggest",
                "size": 20,
                "contexts": {
                    "context": [
                        {
                            "context": "zh_battery_SparePartsDoc"
                        }
                    ]
                },
                "regex": {
                    "flags": "ALL"
                },
                "skip_duplicates": true
            }
        }
    }
}

response:

{
    "took": 0,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "suggest": {
        "suggest": [
            {
                "text": "/iphone 5.*/gi",
                "offset": 0,
                "length": 14,
                "options": []
            }
        ]
    }
}

I'm sure there is data that matches, because the interface returns data when I replace "regex": "/iphone 5.*/gi" with "prefix": "iphone 5".

{
    "text": "iPhone 11 A2221 电池",
    "_index": "product_name_suggest",
    "_id": "095539e21a3d4bdea80d8dacda5e9a2czh",
    "_score": 1.0,
    "_source": {
        "_class": "com.baoyihui.screw.model.elasticsearch.entity.ProductNameSuggest",
        "id": "095539e21a3d4bdea80d8dacda5e9a2czh",
        "suggest": {
            "input": [
                "苹果 iPhone 11 A2221 电池",
                "iPhone 11 A2221 电池",
                "11 A2221 电池",
                "A2221 电池",
                "电池"
            ],
            "contexts": {
                "context": [
                    "zh_battery_SparePartsDoc"
                ]
            }
        }
    },
    "contexts": {
        "context": [
            "zh_battery_SparePartsDoc"
        ]
    }
}

But as you can see, after replacing it with "prefix": "iphone 5", it returns irrelevant data, which is why I wanted to use a regex.


Solution

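  • According to the Elasticsearch documentation on regular expressions (https://www.elastic.co/guide/en/elasticsearch/reference/8.11/query-dsl-regexp-query.html and https://www.elastic.co/guide/en/elasticsearch/reference/8.11/regexp-syntax.html), your regex syntax is not what Elasticsearch expects. Elasticsearch uses Lucene's regular expression syntax, which has no /.../ delimiters and no trailing flags such as gi, so in "/iphone 5.*/gi" the slashes and the letters g and i are treated as literal characters of the pattern.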
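
As a minimal sketch of what this could look like, here is the request from the question with only the pattern changed: it is written without the /.../ delimiters and the gi flags. The lowercase pattern is an assumption based on the "simple" analyzer in the mapping, which lowercases text at index time. That analyzer also splits on non-letter characters, so digits such as the 5 may not be indexed as typed. This sketch therefore only illustrates the syntax and is not guaranteed to return matches with this mapping.

```json
{
    "suggest": {
        "suggest": {
            "regex": "iphone 5.*",
            "completion": {
                "field": "suggest",
                "size": 20,
                "contexts": {
                    "context": [
                        {
                            "context": "zh_battery_SparePartsDoc"
                        }
                    ]
                },
                "regex": {
                    "flags": "ALL"
                },
                "skip_duplicates": true
            }
        }
    }
}
```

If this still returns no options, checking how the analyzer tokenizes an input such as "iPhone 11 A2221" with the _analyze API would show what the pattern is actually being compared against.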