Using the Grok Debugger, I adapted patterns I found on the Internet in a first attempt at handling logback/Spring Boot style logs.
Here is a log entry as sent to the Grok Debugger:
2022-03-09 06:35:15,821 [http-nio-9090-exec-1] WARN org.springdoc.core.OpenAPIService - found more than one OpenAPIDefinition class. springdoc-openapi will be using the first one found.
with the grok pattern:
(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)
and it dispatches the content into fields as intended:
{
  "timestamp": [["2022-03-09 06:35:15,821"]],
  "YEAR": [["2022"]],
  "MONTHNUM": [["03"]],
  "MONTHDAY": [["09"]],
  "TIME": [["06:35:15,821"]],
  "HOUR": [["06"]],
  "MINUTE": [["35"]],
  "SECOND": [["15,821"]],
  "thread": [["http-nio-9090-exec-1"]],
  "level": [["WARN"]],
  "class": [["org.springdoc.core.OpenAPIService"]],
  "logmessage": [["found more than one OpenAPIDefinition class. springdoc-openapi will be using the first one found."]]
}
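One caveat worth noting about this pattern: the `(?<thread>(.*?)+)` capture nests a quantifier inside another quantifier, which works but can trigger heavy regex backtracking on non-matching lines. The stock `DATA` grok pattern (defined as `.*?`) captures the same thing more simply; a sketch of an equivalent pattern using it:

```
(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[%{DATA:thread}\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)
```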
But when I ask Logstash to perform the same parsing, with this input declaration in my configuration:
input {
  file {
    path => "/home/lebihan/dev/Java/comptes-france/metier-et-gestion/dev/ApplicationMetierEtGestion/sparkMetier.log"
    codec => multiline {
      pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
      negate => "true"
      what => "previous"
    }
  }
}
and this filter declaration:
filter {
  # If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }
  grok {
    match => [ "message",
      "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)"
    ]
  }
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ]
  }
}
it fails to parse, and I don't know how to get more detail about the underlying error behind the _grokparsefailure tag.
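For iterating on a pattern like this, one approach is a throwaway pipeline that reads from stdin and prints the full parsed event, so you can paste in a single log line and see exactly which fields grok produced (or which tag it added on failure). A minimal sketch, reusing the pattern from above; the custom `tag_on_failure` value is just an illustration to make this grok's failures distinguishable from others:

```
input { stdin {} }

filter {
  grok {
    match => { "message" => "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) \[(?<thread>(.*?)+)\] %{LOGLEVEL:level}\s+%{GREEDYDATA:class} - (?<logmessage>.*)" }
    # Replace the default _grokparsefailure tag so we can tell which grok failed
    tag_on_failure => ["_grokparsefailure_applog"]
  }
}

output { stdout { codec => rubydebug } }
```

Running Logstash with `--log.level=debug` also makes it log considerably more about what each filter is doing.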
The main cause of my trouble was writing:

grok {
  match => [

instead of:

grok {
  match => {
But after that, I also had to switch the capture to %{TIMESTAMP_ISO8601:timestamp} and adjust the date filter format to avoid a _dateparsefailure.
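The detail behind that _dateparsefailure: logback prints the milliseconds after a comma (e.g. `06:35:15,821`), while my first date filter declared a dot (`ss.SSS`). The fragment that matches the actual log format looks like this (the sample value in the comment is from my own log):

```
date {
  # 2022-03-16 07:32:24,860 — note the comma before the milliseconds
  match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
  target => "timestamp"
}
```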
A parsed event then comes out like this:

@timestamp: Mar 16, 2022 @ 09:14:22.002
@version: 1
class: f.e.service.AbstractSparkDataset
host: debian
level: INFO
logmessage: Un dataset a été sauvegardé dans le fichier parquet /data/tmp/balanceComptesCommunes_2019_2019.
thread: http-nio-9090-exec-10
timestamp: 2022-03-16T06:34:09.394Z
_id: 8R_KkX8BBIYNTaMw1Jfg
_index: ecoemploimetier-2022.03.16
_score: -
_type: _doc
I eventually corrected my Logstash config file as follows:
input {
  file {
    path => "/home/[...]/myLog.log"
    # Don't persist the read position, and read the file from the start on each run
    sincedb_path => "/dev/null"
    start_position => "beginning"
    codec => multiline {
      pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}.*"
      negate => "true"
      what => "previous"
    }
  }
}
filter {
  # If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[(?<thread>(.*?)+)\] %{LOGLEVEL:level} %{GREEDYDATA:class} - (?<logmessage>.*)" }
  }
  date {
    # 2022-03-16 07:32:24,860
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
    target => "timestamp"
  }
  # If there was no parsing error, remove the original, unparsed message
  if "_grokparsefailure" not in [tags] {
    mutate {
      remove_field => [ "message", "path" ]
    }
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "ecoemploimetier-%{+YYYY.MM.dd}"
  }
}