I have a school project in which I use the ELK stack.
I have a lot of log data, and I want to find out which log lines are duplicates, and how many duplicates there are of each particular line, filtered by log level, server and time range.
I tried the following query, which successfully extracts the duplicate counts:
GET /_all/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "beat.hostname": "server-x"
          }
        },
        {
          "match": {
            "log_level": "WARNING"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-48h",
              "lte": "now"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "duplicateNames": {
      "terms": {
        "field": "message_description.keyword",
        "min_doc_count": 2,
        "size": 10000
      }
    }
  }
}
It successfully gives me the output:
"aggregations" : {
  "duplicateNames" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "AuthToken not found [ ]",
        "doc_count" : 657
      }
    ]
  }
}
When I try the very same query and only change the log_level from WARNING to CRITICAL, it gives me 0 buckets. This is strange, because I can see in Kibana that there are duplicate message_description field values. Does this have something to do with the .keyword sub-field, or maybe with the length of the message_description?
I hope someone can help me with this weird problem.
Edit: These are two documents that have exactly the same message_description. Why can't I get the results?
{
  "_index" : "filebeat-2019.09.17",
  "_type" : "_doc",
  "_id" : "yYzDP20BiDGBoVteKHjZ",
  "_score" : 10.144365,
  "_source" : {
    "beat" : {
      "name" : "graylog",
      "hostname" : "server-x",
      "version" : "6.8.2"
    },
    "message" : """[2019-09-17 17:06:57] request.CRITICAL: Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444 {"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
    "@version" : "1",
    "source" : "/data/httpd/xxx/xxx/var/log/dev.log",
    "tags" : [
      "beats_input_codec_plain_applied",
      "_grokparsefailure",
      "_dateparsefailure"
    ],
    "timestamp" : "2019-09-17 17:06:57",
    "input" : {
      "type" : "log"
    },
    "offset" : 54819,
    "prospector" : {
      "type" : "log"
    },
    "application" : "request",
    "log_level" : "CRITICAL",
    "stack_trace" : """{"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
    "message_description" : """Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444""",
    "@timestamp" : "2019-09-17T15:06:57.436Z",
    "host" : {
      "name" : "graylog"
    },
    "log" : {
      "file" : {
        "path" : "/data/httpd/xxx/xxx/var/log/dev.log"
      }
    }
  }
},
{
  "_index" : "filebeat-2019.09.17",
  "_type" : "_doc",
  "_id" : "CYzDP20BiDGBoVteKHna",
  "_score" : 10.144365,
  "_source" : {
    "beat" : {
      "name" : "graylog",
      "hostname" : "server-x",
      "version" : "6.8.2"
    },
    "message" : """[2019-09-17 17:06:56] request.CRITICAL: Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444 {"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
    "@version" : "1",
    "source" : "/data/httpd/xxx/xxx/var/log/dev.log",
    "tags" : [
      "beats_input_codec_plain_applied",
      "_grokparsefailure",
      "_dateparsefailure"
    ],
    "timestamp" : "2019-09-17 17:06:56",
    "input" : {
      "type" : "log"
    },
    "offset" : 45716,
    "prospector" : {
      "type" : "log"
    },
    "application" : "request",
    "log_level" : "CRITICAL",
    "stack_trace" : """{"exception":"[object] (ErrorException(code: 0): Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php:444)"} []""",
    "message_description" : """Uncaught PHP Exception ErrorException: "Warning: include(/data/httpd/xxx/xxx/var/cache/dev/overblog/graphql-bundle/__definitions__/QueryType.php): failed to open stream: No such file or directory" at /data/httpd/xxx/xxx/vendor/composer/ClassLoader.php line 444""",
    "@timestamp" : "2019-09-17T15:06:57.426Z",
    "host" : {
      "name" : "graylog"
    },
    "log" : {
      "file" : {
        "path" : "/data/httpd/xxx/xxx/var/log/dev.log"
      }
    }
  }
}
What happens is that the message_description value is longer than 256 characters, so it gets ignored by the .keyword sub-field: with dynamic mapping, string fields are mapped as text with a keyword sub-field whose ignore_above defaults to 256, and longer values are simply not indexed into it, so they can never appear in a terms-aggregation bucket. Run GET filebeat-2019.09.17 to confirm this.
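Assuming the index was created by dynamic mapping (the usual case here, given the _grokparsefailure tags), the relevant part of that GET response should look roughly like this, with 256 being the dynamic-mapping default:

```json
"message_description" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
```

Your WARNING message ("AuthToken not found [ ]") is well under 256 characters, while the CRITICAL one is far over it, which is why only the former shows up in the aggregation.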
What you can do is raise that limit by updating the mapping of the field like this:
PUT filebeat-*/_mapping/_doc
{
  "properties": {
    "message_description": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 500
        }
      }
    }
  }
}
And then reindex all the data already present in those indices in place, so the new limit is applied to the existing documents:
POST filebeat-*/_update_by_query
Once that's done, your query will magically work again ;-)
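The effect of ignore_above can be sketched in a few lines of Python (a simulation of the behavior, not the Elasticsearch internals): values longer than the limit are skipped entirely, not truncated, so their duplicates are invisible to the aggregation until the limit is raised and the data re-indexed.

```python
from collections import Counter

def duplicate_buckets(values, ignore_above):
    """Mimic a terms aggregation with min_doc_count=2 on a keyword
    sub-field: values longer than ignore_above are never indexed."""
    indexed = [v for v in values if len(v) <= ignore_above]
    return {key: count for key, count in Counter(indexed).items() if count >= 2}

short_msg = "AuthToken not found [ ]"              # well under 256 chars
long_msg = "Uncaught PHP Exception " + "x" * 300   # stand-in for the long CRITICAL line

docs = [short_msg, short_msg, long_msg, long_msg]

print(duplicate_buckets(docs, 256))  # {'AuthToken not found [ ]': 2} -- long duplicates invisible
print(duplicate_buckets(docs, 500))  # both duplicate pairs reported after raising the limit
```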