I am trying to find log entries that match an email address in the message field, using the Discover section of Kibana (ELK). I get results with the following query:
@message:"abc@email.com"
However, the results also contain messages where the email should not be matched, and I have not been able to build a query that excludes them.
The results are (data has been sanitized for security reasons):
@message:[INF] [2020-07-07 12:54:51.105] [PID-1] : [abcdefg] [JID-5c] [data] LIST_LOOKUP: abc@email.com | User List from Profiles | name | user_name @id:355502086986714
@message:[INF] [2020-07-07 12:38:36.755] [PID-2] : [abcdefg] [JID-ed2] [data] LIST_LOOKUP: abc@email.com | User List from Profiles | name | user_name @id:355501869671304
@message:[INF] [2020-07-07 12:19:48.141] [PID-3] [abc@email.com] : [c5] [data] Completed 200 OK in 11ms @id:355501617979964834
@message:[INF] [2020-07-07 11:19:48.930] [PID-5] [abc@email.com] : [542] [data] Completed 200 OK in 9ms @id:35550081535
while I want it to be:
@message:[INF] [2020-07-07 12:19:48.141] [PID-3] [abc@email.com] : [c5] [data] Completed 200 OK in 11ms @id:355501617979964834
@message:[INF] [2020-07-07 11:19:48.930] [PID-5] [abc@email.com] : [542] [data] Completed 200 OK in 9ms @id:35550081535
I have tried the following queries:
@message: "[PID-*] [abc@email.com]"
@message: "\[PID-*\] \[abc@email.com\] \:"
@message: "[abc@email.com]"
@message: *abc@email.com*
and several similar searches, without success.
Please let me know what I am missing here and how to run efficient substring searches in Kibana Discover using KQL/Lucene.
Here is the mapping for my index (I am getting the data from CloudWatch Logs):
{
  "cwl-*": {
    "mappings": {
      "properties": {
        "@id": {
          "type": "string"
        },
        "@log_stream": {
          "type": "string"
        },
        "@log_group": {
          "type": "string"
        },
        "@message": {
          "type": "string"
        },
        "@owner": {
          "type": "string"
        },
        "@timestamp": {
          "type": "date"
        }
      }
    }
  }
}
As @Gibbs already mentioned, the cause is that all of your data contains the string abc@email.com. Your mapping now confirms that you are using a string field without an explicit analyzer, so it uses the default standard analyzer, which splits abc@email.com into the tokens abc and email.com.
Instead, you should map the field that holds the email address to a custom analyzer that uses the UAX URL Email tokenizer, which keeps an email address as a single token.
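To make the tokenization difference concrete, here is a rough Python sketch. It does not call Elasticsearch: the two regular expressions are simplified stand-ins I am assuming for illustration, not the Unicode segmentation rules the real tokenizers follow, and the function names `standard_like` and `email_aware` are hypothetical.

```python
import re

# Simplified stand-ins for illustration only (assumption): the real
# Elasticsearch tokenizers follow Unicode text segmentation rules.
WORD = r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*"
EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"


def standard_like(text):
    """Roughly mimic the standard analyzer: break at '@' and lowercase."""
    return [token.lower() for token in re.findall(WORD, text)]


def email_aware(text):
    """Roughly mimic uax_url_email: an email address stays one token."""
    return re.findall(f"{EMAIL}|{WORD}", text)


sample = "[PID-3] [abc@email.com] : [c5]"
standard_tokens = standard_like(sample)
email_tokens = email_aware(sample)

print(standard_tokens)  # ['pid', '3', 'abc', 'email.com', 'c5']
print(email_tokens)     # ['PID', '3', 'abc@email.com', 'c5']
```

With the first behaviour the address is indexed as two separate terms, while the second keeps it searchable as one term.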
Below is an example of how to create this analyzer.
Mapping with a custom email analyzer:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "email_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "uax_url_email"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "email": {
        "type": "text",
        "analyzer": "email_analyzer"
      }
    }
  }
}
Analyze API request and response:
POST http://{{hostname}}:{{port}}/{{index-name}}/_analyze
{
  "analyzer": "email_analyzer",
  "text": "abc@email.com"
}
{
  "tokens": [
    {
      "token": "abc@email.com",
      "start_offset": 0,
      "end_offset": 13,
      "type": "<EMAIL>",
      "position": 0
    }
  ]
}
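If you would rather issue the analyze call from a script, a minimal Python sketch using only the standard library could look like the following. The base URL `http://localhost:9200`, the index name `my-index`, and the helper `build_analyze_request` are assumptions for illustration; substitute your own host, port, and index. The request is only built here, not sent.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, analyzer, text):
    """Build (but do not send) a POST request for the _analyze endpoint."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and index name; replace with your own values.
request = build_analyze_request(
    "http://localhost:9200", "my-index", "email_analyzer", "abc@email.com"
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it against a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Against an index created with the mapping above, the response should contain a single token of type <EMAIL>, as in the output shown.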