logging, timestamp, logstash-grok

Grok pattern for the following log entry


I'm having difficulty coming up with a pattern for the following log entry.

[INFO ] 2020-02-07 16:11:56.148 [localhost-startStop-1] DOMUtilities - System property DocumentBuilderCacheBlockSize is not defined, using default 25

The following is what I have.

  %{LOGLEVEL:loglevel} %{YEAR} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}[%{DATA:threadName}\]\s+\%{DATA:javafile}\s[-:]\s+%{GREEDYDATA:message}

Can anyone fill me in on what I am doing wrong please? I know the issue lies around the date format but I just cannot find the answer.


Solution

  • Your grok pattern has several issues (listed in the order they occur in the pattern):

    • The log level is inside square brackets, but the LOGLEVEL pattern matches only the level keywords themselves, not the brackets. There is also a space between the log level and the closing square bracket.
    • The date values of your timestamp (year, month, day) are separated by hyphens, which your pattern does not account for.
    • The MONTH pattern matches full or abbreviated month names, e.g. Feb or February. You need the MONTHNUM2 pattern instead, which matches two-digit month numbers.
    • There is whitespace between the seconds and the thread name, which your pattern does not match.
    • You need to escape the opening square bracket in [%{DATA:threadName}\], since it is a special character in regex.
    • There is no need for a backslash before %{DATA:javafile}.
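
    To illustrate the MONTH versus MONTHNUM2 point, here is a small Python sketch. The two regexes are simplified stand-ins that I wrote for this example, not the official grok definitions, so treat them as assumptions:

```python
import re

# Simplified stand-in for grok's MONTH: month names, full or abbreviated
# (an assumption for illustration, not the official definition).
MONTH_NAME = re.compile(
    r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?"
    r"|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)"
)

# Simplified stand-in for grok's MONTHNUM2: two-digit month numbers 01..12.
MONTH_NUM2 = re.compile(r"(?:0[1-9]|1[0-2])")

# The month in "2020-02-07" is the numeric string "02".
name_matches_number = MONTH_NAME.fullmatch("02") is not None
num_matches_number = MONTH_NUM2.fullmatch("02") is not None

print("name-based pattern matches '02':", name_matches_number)
print("numeric pattern matches '02':", num_matches_number)
```

    A name-based pattern has nothing to match in a numeric date, which is why the original pattern fails at the month even once the hyphens are added.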

    Please take a closer look at the Logstash grok patterns and their definitions.

    With the example log you've provided I came up with the following pattern:

    ^\[%{LOGLEVEL:loglevel}\s?+\]\s+%{YEAR}-%{MONTHNUM2}-%{MONTHDAY}\s+%{HOUR}:%{MINUTE}:%{SECOND}\s+\[%{DATA:threadName}\]\s+%{DATA:javafile}\s[-:]\s+%{GREEDYDATA:message}

    You can verify your patterns on this page.

    I hope I could help you.