My tokenizer
"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 10,
"token_chars": [
"letter",
"digit"
]
}
}
I am trying to search on these fields by prefix. If I search with s, I should get the items starting with s. If I search with sp, I want only the items starting with sp, and everything else should be discarded. That is not what I get: the results also contain items that only start with s. Is my query wrong, or is the tokenizer I have used wrong? Can someone please help me with this?
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "PRODUCT",
"fields": [
"item",
"data1"
]
}
},
{
"multi_match": {
"query": "SUB_FAMILY",
"fields": [
"item",
"data1"
]
}
},
{
"match": {
"values": "SP"
}
}
]
}
}
}
The output for this query is
"hits": [
{
"_index": "logs_datas",
"_type": "_doc",
"_id": "H1PfEnkBQXpKNrJSp8bV",
"_score": 9.418445,
"_source": {
"message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
"path": "/home/elasticsearchDatas.csv",
"hierarchy_name": "PRODUCT",
"@version": "1",
"@timestamp": "2021-04-27T10:28:37.578Z",
"host": "ewiglp71",
"item_pk": "SPRINHO2H",
"attribute_name": "SUB_FAMILY"
}
},
{
"_index": "logs_datas",
"_type": "_doc",
"_id": "y1PfEnkBQXpKNrJSp8XQ",
"_score": 5.3059187,
"_source": {
"message": "PRODUCT,SUB_FAMILY,SCMLPLWVI",
"path": "/home/niteshb/elasticsearchDatas.csv",
"hierarchy_name": "PRODUCT",
"@version": "1",
"@timestamp": "2021-04-27T10:28:37.577Z",
"host": "ewiglp71",
"item_pk": "SCMLPLWVI",
"attribute_name": "SUB_FAMILY"
}
},
{
"_index": "logs_datas",
"_type": "_doc",
"_id": "zFPfEnkBQXpKNrJSp8XQ",
"_score": 5.3059187,
"_source": {
"message": "PRODUCT,SUB_FAMILY,SSVRKEN2Z",
"path": "/home/elasticsearchDatas.csv",
"hierarchy_name": "PRODUCT",
"@version": "1",
"@timestamp": "2021-04-27T10:28:37.579Z",
"host": "ewiglp71",
"item_pk": "SSVRKEN2Z",
"attribute_name": "SUB_FAMILY"
}
}
]
Since min_gram is 1, the tokens generated for SCMLPLWVI will be:
{
"tokens": [
{
"token": "S",
"start_offset": 0,
"end_offset": 1,
"type": "word",
"position": 0
},
{
"token": "SC",
"start_offset": 0,
"end_offset": 2,
"type": "word",
"position": 1
},
{
"token": "SCM",
"start_offset": 0,
"end_offset": 3,
"type": "word",
"position": 2
},
{
"token": "SCML",
"start_offset": 0,
"end_offset": 4,
"type": "word",
"position": 3
},
{
"token": "SCMLP",
"start_offset": 0,
"end_offset": 5,
"type": "word",
"position": 4
},
{
"token": "SCMLPL",
"start_offset": 0,
"end_offset": 6,
"type": "word",
"position": 5
},
{
"token": "SCMLPLW",
"start_offset": 0,
"end_offset": 7,
"type": "word",
"position": 6
},
{
"token": "SCMLPLWV",
"start_offset": 0,
"end_offset": 8,
"type": "word",
"position": 7
},
{
"token": "SCMLPLWVI",
"start_offset": 0,
"end_offset": 9,
"type": "word",
"position": 8
}
]
}
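To see why these tokens make a search for SP return SCMLPLWVI, here is a rough Python sketch. It is not Elasticsearch code: edge_ngrams and matches are hypothetical helpers, and it assumes the field has no separate search_analyzer, so the search term is split by the same tokenizer as the indexed value.

```python
def edge_ngrams(text, min_gram=1, max_gram=10):
    """Roughly mimic an edge_ngram tokenizer on one alphanumeric word."""
    longest = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, longest + 1)]


def matches(query, value, min_gram=1):
    """Mimic a match query (OR operator): one shared token is enough for a hit.

    Assumption: the query is analyzed with the same tokenizer as the value.
    """
    query_tokens = set(edge_ngrams(query, min_gram))
    value_tokens = set(edge_ngrams(value, min_gram))
    return bool(query_tokens & value_tokens)


# item_pk values taken from the search output in the question
values = ["SPRINHO2H", "SCMLPLWVI", "SSVRKEN2Z"]

print(edge_ngrams("SP"))                          # ['S', 'SP']
print([v for v in values if matches("SP", v)])    # all three values
print([v for v in values if matches("SP", v, min_gram=2)])  # ['SPRINHO2H']
```

With min_gram at 1 the search term itself contributes the token S, which every value starting with S also contains. The min_gram parameter of the helper lets you compare other settings.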
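Unless the field has a separate search_analyzer, the search term is analyzed with the same tokenizer, so SP becomes the tokens S and SP. A match query returns every document that contains either token, which is why all values starting with S come back.
If you want to get only the values starting with sp, you can set min_gram to 2 in your tokenizer, so that SP is kept as a single token: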
"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram",
"min_gram": 2, // note this
"max_gram": 10,
"token_chars": [
"letter",
"digit"
]
}
}
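Note that raising min_gram only helps for two-character searches: if the search term goes through the same tokenizer, SPR would still be split into SP and SPR and match every value starting with SP. A commonly used alternative is to keep the edge n-grams at index time and analyze the search term without n-grams through the search_analyzer mapping parameter. Below is a minimal sketch of such an index body, built in Python only so it can be printed as JSON. The analyzer names autocomplete and autocomplete_search, the lowercase filter and the choice of the item_pk field are assumptions, not something taken from your index.

```python
import json

# Hypothetical index body: analyzer names and the lowercase filter are
# illustrative assumptions; only my_tokenizer comes from the question.
index_body = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "my_tokenizer": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 10,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                # index time: emit S, SP, SPR, ... (lowercased)
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "my_tokenizer",
                    "filter": ["lowercase"],
                },
                # search time: keep the typed term whole (lowercased)
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            "item_pk": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "autocomplete_search",
            }
        }
    },
}

# Print the body that would be sent when creating the index
print(json.dumps(index_body, indent=2))
```

With a mapping along these lines, a search for sp is looked up as the single token sp, which only values starting with sp have in their n-gram list.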
Update 1:
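You can use a match_bool_prefix query to search for words starting with s or sp. It does not need the edge_ngram tokenizer: the query analyzes its input and matches the last term as a prefix, so a plain text field is enough.
Adding a working example: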
Index Mapping:
{
"mappings": {
"properties": {
"item_pk": {
"type": "text"
}
}
}
}
Search Query 1:
{
"query": {
"match_bool_prefix" : {
"item_pk" : "s"
}
}
}
Search Result:
"hits": [
{
"_index": "67281810",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
"path": "/home/niteshb/elasticsearchDatas.csv",
"hierarchy_name": "PRODUCT",
"@version": "1",
"@timestamp": "2021-04-27T10:28:37.578Z",
"host": "ewiglp71",
"item_pk": "SPRINHO2H",
"attribute_name": "SUB_FAMILY"
}
},
{
"_index": "67281810",
"_type": "_doc",
"_id": "i7quE3kB6jKCA-nFYii6",
"_score": 1.0,
"_source": {
"message": "PRODUCT,SUB_FAMILY,SCMLPLWVI",
"path": "/home/niteshb/elasticsearchDatas.csv",
"hierarchy_name": "PRODUCT",
"@version": "1",
"@timestamp": "2021-04-27T10:28:37.577Z",
"host": "ewiglp71",
"item_pk": "SCMLPLWVI",
"attribute_name": "SUB_FAMILY"
}
},
{
"_index": "67281810",
"_type": "_doc",
"_id": "jLquE3kB6jKCA-nFgiju",
"_score": 1.0,
"_source": {
"message": "PRODUCT,SUB_FAMILY,SSVRKEN2Z",
"path": "/home/niteshb/elasticsearchDatas.csv",
"hierarchy_name": "PRODUCT",
"@version": "1",
"@timestamp": "2021-04-27T10:28:37.579Z",
"host": "ewiglp71",
"item_pk": "SSVRKEN2Z",
"attribute_name": "SUB_FAMILY"
}
}
]
Search Query 2:
{
"query": {
"match_bool_prefix" : {
"item_pk" : "sp"
}
}
}
Search Result:
"hits": [
{
"_index": "67281810",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"message": "PRODUCT,SUB_FAMILY,SPRINHO2H",
"path": "/home/niteshb/elasticsearchDatas.csv",
"hierarchy_name": "PRODUCT",
"@version": "1",
"@timestamp": "2021-04-27T10:28:37.578Z",
"host": "ewiglp71",
"item_pk": "SPRINHO2H",
"attribute_name": "SUB_FAMILY"
}
}
]
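The behaviour of the two searches above can be approximated with a short Python sketch. It is only a model of the matching rule (every term except the last is looked up as an exact term, the last term as a prefix, combined with OR), and analyze is a crude, hypothetical stand-in for the default analyzer.

```python
import re


def analyze(text):
    """Crude stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_bool_prefix(query, value):
    """Model of match_bool_prefix: exact terms plus a prefix clause for the last term."""
    terms = analyze(query)
    tokens = analyze(value)
    if not terms:
        return False
    *exact_terms, last_term = terms
    exact_hit = any(term in tokens for term in exact_terms)
    prefix_hit = any(token.startswith(last_term) for token in tokens)
    return exact_hit or prefix_hit


# item_pk values from the example documents
values = ["SPRINHO2H", "SCMLPLWVI", "SSVRKEN2Z"]

print([v for v in values if match_bool_prefix("s", v)])   # all three values
print([v for v in values if match_bool_prefix("sp", v)])  # ['SPRINHO2H']
```

Because the prefix is compared against the whole indexed token, a longer search term such as spr narrows the results further instead of widening them.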
Update 2:
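The documents in your output store these values in hierarchy_name, attribute_name and item_pk, not in item, data1 and values, so try this query, which targets those fields: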
{
"query": {
"bool": {
"must": [
{
"match": {
"hierarchy_name": "PRODUCT"
}
},
{
"match": {
"attribute_name": "SUB_FAMILY"
}
},
{
"match_bool_prefix": {
"item_pk": "sp"
}
}
]
}
}
}