I have this kind of log:
2020-09-02 14:29:22,854 [http-something] [ERROR] JavaClass(JavaLine) - [6652942]: Error message with no stack trace
2020-09-02 14:29:08,976 [http-something] [INFO] JavaClass(JavaLine) - [6791732]: Some message
2020-09-02 14:29:09,116 [http-something] [ERROR] JavaClass(JavaLine) - [6791732]: Error message with stack trace
JavaException: This is not going well
at JavaClass
at JavaClass
at JavaClass
at JavaClass
at JavaClass
Caused by: JavaClass: This is a problem
at JavaClass
at JavaClass
at JavaClass
at JavaClass
... 48 more
and I use this filter to have a more readable log on Kibana:
filter {
  # INFO and ERROR
  grok {
    tag_on_failure => ["_stackTraceFailure"]
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}(\[%{DATA:thread}\])?%{SPACE}\[%{LOGLEVEL:log_level}\]%{SPACE}%{GREEDYDATA}%{SPACE}\-%{SPACE}%{GREEDYDATA:action}" }
    overwrite => [ "message" ]
  }
  # JAVA ERROR
  if ("_stackTraceFailure" in [tags]) {
    grok {
      tag_on_failure => ["_grokParseFailure"]
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}(\[%{DATA:thread}\])?%{SPACE}\[%{LOGLEVEL:log_level}\]%{SPACE}%{GREEDYDATA}%{SPACE}\-%{SPACE}%{DATA:issue}(\r|\n)+(?m)%{GREEDYDATA:stack-trace}" }
      overwrite => [ "message" ]
      remove_tag => "_stackTraceFailure"
    }
  }
}
The problem is that the first pattern matches everything, putting the whole stack trace (when there is one) in the action field, so the second pattern is never used. I know this is caused by GREEDYDATA, but I'm not very skilled with regex and I can't find a way to do what I want.
I don't want to swap the order of the patterns, because INFO and ERROR lines without a stack trace are far more common. So I need a way to make the first pattern fail on a multiline log, or on anything that contains some sort of stack trace. How can I do that, starting from what I have so far?
You need to use conditionals before your groks. You can either use a conditional on the entire message to choose between your two grok filters, or keep your first grok filter as it is and use a conditional to parse only the action field. I would suggest the second option.
In both cases the conditional must test for something that only exists in your multiline messages; in this case that could be the "at JavaClass" string.
So you would need something like this:
if "at JavaClass" not in [message] {
  grok { your first grok }
} else {
  grok { your second grok }
}
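Spelled out with the two patterns from the question, the first option might look like the sketch below. This is only an illustration, not a tested configuration: it assumes "at JavaClass" really appears in every stack trace (in the sample it is an anonymized placeholder, so real logs would need a substring taken from actual stack frames), and it leaves out the custom failure tags and the overwrite setting because the conditional now does the routing.

```
filter {
  if "at JavaClass" not in [message] {
    # Common case: single-line INFO and ERROR events
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}(\[%{DATA:thread}\])?%{SPACE}\[%{LOGLEVEL:log_level}\]%{SPACE}%{GREEDYDATA}%{SPACE}\-%{SPACE}%{GREEDYDATA:action}" }
    }
  } else {
    # Events that carry a stack trace
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}(\[%{DATA:thread}\])?%{SPACE}\[%{LOGLEVEL:log_level}\]%{SPACE}%{GREEDYDATA}%{SPACE}\-%{SPACE}%{DATA:issue}(\r|\n)+(?m)%{GREEDYDATA:stack-trace}" }
    }
  }
}
```

Because the substring test runs before either grok, the cheaper single-line pattern is still the one used for the most frequent events.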
If you want to keep your first grok and use a second one to parse only the action field, it would be something like this.
if "at JavaClass" in [action] {
  grok {
    tag_on_failure => ["_grokParseFailure"]
    match => { "action" => "%{DATA:issue}(\r|\n)+(?m)%{GREEDYDATA:stack-trace}" }
  }
}
You didn't say how you are collecting your logs. If you are using Filebeat, or Logstash with the multiline codec in the input, you could also filter based on the tags, since your multiline events would have a tag named multiline.
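As a rough sketch of that tag-based variant, assuming the events are assembled by the Logstash multiline codec with its default multiline_tag of "multiline" (with Filebeat the marker may be stored in a different field, so check one of your events in Kibana first), the second grok could be guarded like this:

```
filter {
  # Your first grok stays here unchanged and fills [action].

  # Only events that were joined from several lines carry the tag.
  if "multiline" in [tags] {
    grok {
      tag_on_failure => ["_grokParseFailure"]
      match => { "action" => "%{DATA:issue}(\r|\n)+(?m)%{GREEDYDATA:stack-trace}" }
    }
  }
}
```

This avoids depending on a particular string inside the stack trace, at the cost of depending on how the input was configured.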