I am trying to use the regexp parser plugin in fluentd to index my application's logs.
Here's a sample log line:
2020-05-06T22:34:50.860-0700 - WARN [main] o.s.b.GenericTypeAwarePropertyDescriptor: Invalid JavaBean property 'pipeline' being accessed! Ambiguous write methods found next to actually used [public void com.theoaal.module.pipeline.mbean.DynamicPhaseExecutionConfigurationMBeanBuilder.setPipeline(com.theplatform.module.pipeline.DynamicPipeline)]: [public void com.theplatform.module.pipeline.mbean.PhaseExecutionConfigurationMBeanBuilder.setPipeline(com.theoaal.module.pipeline.Pipeline)]
I tested the pattern below on regex101.com, but it does not match:
^(?<date>\d{4}\-\d{2}\-\d{2})(?<timestamp>[A-Z][a-z]{1}\d{2}:\d{2}:\d{2}.\d{3}\-\d{4})\s\-\s(?<loglevel>\[\w\]{6})\s+(?<class>\[[A-Z][a-z]+\])\s(?<message>.*)$
Kindly help. Thanks
You may use
^(?<date>\d{4}-\d{2}-\d{2})[A-Z](?<timestamp>\d{2}:\d{2}:\d{2}\.\d{3}-\d{4})\s+-\s+(?<loglevel>\w+)\s+(?<class>\[\w+\])\s+(?<message>.*)
See the regex demo
Note that in your pattern, \[\w\]{6} only matches [, a single word char, and six ] chars. In the timestamp pattern, [A-Z][a-z]{1} requires two letters (an uppercase one followed by a lowercase one), but there is only a single T. Your "class" pattern requires a capitalized word with [A-Z][a-z]+, but main is all lowercase. You escape - outside of character classes unnecessarily, and you failed to escape a literal dot in the pattern.
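The points above can be checked one fragment at a time. The following is a minimal Ruby sketch, chosen because fluentd's regexp parser uses Ruby regular expressions; the constant names and the short test strings are hypothetical and only serve to isolate each fragment.

```ruby
# Hypothetical constant names; each regex is one fragment of the patterns discussed above.
OLD_LOGLEVEL = /\[\w\]{6}/           # "[", one word char, then six "]"
NEW_LOGLEVEL = /\w+/                 # one or more word chars
OLD_TIME_SEP = /[A-Z][a-z]{1}\d{2}/  # uppercase letter + lowercase letter + 2 digits
NEW_TIME_SEP = /[A-Z]\d{2}/          # single uppercase letter + 2 digits
OLD_CLASS    = /\[[A-Z][a-z]+\]/     # capitalized word in brackets
NEW_CLASS    = /\[\w+\]/             # any word chars in brackets
OLD_DOT      = /\d{2}.\d{3}/         # unescaped dot: matches any character
NEW_DOT      = /\d{2}\.\d{3}/        # escaped dot: matches a literal "." only

puts OLD_LOGLEVEL.match?("WARN")      # false
puts OLD_LOGLEVEL.match?("[W]]]]]]")  # true, which is clearly not the intent
puts NEW_LOGLEVEL.match?("WARN")      # true

puts OLD_TIME_SEP.match?("T22")       # false, only one letter precedes the digits
puts NEW_TIME_SEP.match?("T22")       # true

puts OLD_CLASS.match?("[main]")       # false, "main" is all lowercase
puts NEW_CLASS.match?("[main]")       # true

puts OLD_DOT.match?("50x860")         # true, the dot accepted "x"
puts NEW_DOT.match?("50x860")         # false
puts NEW_DOT.match?("50.860")         # true
```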
Details
^ - start of string
(?<date>\d{4}-\d{2}-\d{2}) - date: 4 digits, -, 2 digits, -, 2 digits
[A-Z] - an uppercase ASCII letter
(?<timestamp>\d{2}:\d{2}:\d{2}\.\d{3}-\d{4}) - 2 digits, :, 2 digits, :, 2 digits, ., 3 digits, - and 4 digits
\s+-\s+ - a hyphen (-) enclosed with 1+ whitespaces
(?<loglevel>\w+) - 1+ word chars
\s+ - 1+ whitespaces
(?<class>\[\w+\]) - [, 1+ word chars, ]
\s+ - 1+ whitespaces
(?<message>.*) - the rest of the line.
Copy and paste to fluent.conf or td-agent.conf:
<source>
  @type tail
  path /var/log/foo/bar.log
  pos_file /var/log/td-agent/foo-bar.log.pos
  tag foo.bar
  format /^(?<date>\d{4}-\d{2}-\d{2})[A-Z](?<timestamp>\d{2}:\d{2}:\d{2}\.\d{3}-\d{4})\s+-\s+(?<loglevel>\w+)\s+(?<class>\[\w+\])\s+(?<message>.*)/
</source>
Test:
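One way to check the pattern outside fluentd is a short Ruby script, since fluentd's regexp parser uses Ruby regular expressions. This is only a sketch: LOG_PATTERN, SAMPLE_LINE and fields are hypothetical names, and the sample line is the log entry from the question with its message shortened for readability.

```ruby
# The pattern from the answer, unchanged.
LOG_PATTERN = /^(?<date>\d{4}-\d{2}-\d{2})[A-Z](?<timestamp>\d{2}:\d{2}:\d{2}\.\d{3}-\d{4})\s+-\s+(?<loglevel>\w+)\s+(?<class>\[\w+\])\s+(?<message>.*)/

# Assumption: the message part of the question's log line is shortened here.
SAMPLE_LINE = "2020-05-06T22:34:50.860-0700 - WARN [main] " \
              "o.s.b.GenericTypeAwarePropertyDescriptor: Invalid JavaBean property 'pipeline' being accessed!"

match = LOG_PATTERN.match(SAMPLE_LINE)

# named_captures returns a Hash of group name => captured text.
fields = match ? match.named_captures : {}

if fields.empty?
  puts "no match"
else
  fields.each { |name, value| puts "#{name}: #{value}" }
end
```

If the pattern matches, the script prints one line per named group (date, timestamp, loglevel, class, message), which are the same keys fluentd would emit as record fields.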