This is a sample of the log line I'm parsing. I'm using grok, but the result is not what I expected.
180528 8:46:26 2 Query SELECT 1
My grok pattern for this line is:
%{NUMBER:date} %{NOTSPACE:time}%{INT:pid}%{GREEDYDATA:message}
and the output in the grok debugger is:
> {
>   "date": [["180528"]],
>   "time": [["8:46:2"]],
>   "pid": [["6"]],
>   "message": [[" 2 Query\tSELECT 1"]]
> }
As the output shows, pid is being taken from the last digit of the time, and the actual pid, which is 2, is being merged into message. I'm not sure what went wrong here.
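Why not match your time with the TIME pattern instead? It doesn't make sense to match it with NOTSPACE, which is equivalent to \S+ and matches any non-whitespace character (the same as [^\r\n\t\f\v ]). Your pattern has no separator between %{NOTSPACE:time} and %{INT:pid}, so the greedy \S+ first consumes all of 8:46:26 and then has to give back its last character so that INT can match at least one digit. That is why time ends up as 8:46:2 and pid as 6, while the real pid is left for GREEDYDATA.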
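The split can be reproduced outside grok with plain regular expressions. The following is only a sketch in Python: the constants NUMBER, NOTSPACE, INT and GREEDYDATA are simplified stand-ins I am assuming for the grok patterns of the same name, not their exact definitions, and the sample line is assumed to use single spaces as shown in the question.

```python
import re

# Simplified stand-ins for the grok patterns (assumptions, not the real definitions).
NUMBER = r"\d+"
NOTSPACE = r"\S+"
INT = r"[+-]?\d+"
GREEDYDATA = r".*"

line = "180528 8:46:26 2 Query SELECT 1"

# Same shape as the pattern in the question: nothing separates time from pid.
original = re.compile(
    rf"(?P<date>{NUMBER}) (?P<time>{NOTSPACE})(?P<pid>{INT})(?P<message>{GREEDYDATA})"
)

# \S+ grabs "8:46:26", then backtracks one character so INT can match "6".
result = original.match(line).groupdict()
print(result)
```

Here time comes back as 8:46:2 and pid as 6, the same split the grok debugger reports.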
You can use the TIME pattern for your time value and INT for pid, as follows:
%{NUMBER:date}\s%{TIME:time}\s%{INT:pid}\s%{GREEDYDATA:message}
This will give you,
{
"date": [
[
"180528"
]
],
"BASE10NUM": [
[
"180528"
]
],
"time": [
[
"8:46:26"
]
],
"HOUR": [
[
"8"
]
],
"MINUTE": [
[
"46"
]
],
"SECOND": [
[
"26"
]
],
"pid": [
[
"2"
]
],
"message": [
[
"Query SELECT 1"
]
]
}
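For comparison, here is a rough Python sketch of the same idea as the corrected pattern. The TIME regex below is a simplified assumption (hours, minutes and seconds separated by colons), not grok's actual TIME definition, and the other constants are likewise approximations. The explicit \s between fields is what stops one field from borrowing characters from the next.

```python
import re

# Simplified stand-ins for the grok patterns (assumptions, not the real definitions).
NUMBER = r"\d+"
TIME = r"\d{1,2}:\d{2}:\d{2}"
INT = r"[+-]?\d+"
GREEDYDATA = r".*"

line = "180528 8:46:26 2 Query SELECT 1"

# Each field is followed by an explicit whitespace separator.
fixed = re.compile(
    rf"(?P<date>{NUMBER})\s(?P<time>{TIME})\s(?P<pid>{INT})\s(?P<message>{GREEDYDATA})"
)

result = fixed.match(line).groupdict()
print(result)
```

With the separators in place, time keeps all of 8:46:26 and pid is 2. The sub-captures shown by the grok debugger (BASE10NUM, HOUR, MINUTE, SECOND) are not reproduced in this sketch.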