I am trying to parse a local log file; I am running the ELK stack on my Windows machine. Here is an example of the logs I am trying to parse.
2015-12-10 13:50:25,487 [http-nio-8080-exec-26] INFO a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- This Message is OK
2015-12-10 13:50:26,487 [http-nio-8080-exec-26] INFO a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- Journe
y road update: <rows>
<row adi="D" date="2015-12-10" garage="TOP">
<codeNum order="1">TP</codeNum>
<number order="1">1001</number>
<journeystatus code="RT">OnRoute</journeystatus>
</row>
</rows>
The first message works fine through the filters, but the second message gets split into multiple messages with _grokparsefailure in the tags field.
Logstash Config File
input {
    file {
        path => "C:/data/sampleLogs/temp.log"
        type => "testlog"
        start_position => "beginning"
    }
}
filter {
    grok {
        # Parse timestamp data. We need the "(?m)" so that grok (Oniguruma internally) correctly parses multi-line events
        match => [ "message", [
            "(?m)%{TIMESTAMP_ISO8601:logTimestamp}[ ;]\[%{DATA:threadId}\][ ;]%{LOGLEVEL:logLevel}[ ;]+%{JAVACLASS:JavaClass}[ ;]%{SYSLOG5424SD:TransactionID}[ ;]*%{GREEDYDATA:LogMessage}",
            "(?m)%{TIMESTAMP_ISO8601:logTimestamp}[ ;]\[%{DATA:threadId}\][ ;]%{LOGLEVEL:logLevel}[ ;]+%{JAVAFILE:JavaClass}[ ;]%{SYSLOG5424SD:TransactionID}[ ;]*%{GREEDYDATA:LogMessage}"
            ]
        ]
    }
    # The timestamp may have commas instead of dots. Convert so as to store everything in the same way
    mutate {
        gsub => [
            # replace all commas with dots
            "logTimestamp", ",", "."
        ]
    }
    mutate {
        gsub => [
            # make the logTimestamp sortable. With a space, it is not! This does not work that well, in the end,
            # but somehow apparently makes things easier for the date filter
            "logTimestamp", " ", ";"
        ]
    }
    date {
        locale => "en"
        timezone => "UTC"
        match => [ "logTimestamp", "YYYY-MM-dd;HH:mm:ss.SSS" ]
        target => "@timestamp"
    }
    mutate {
        add_field => { "debug-timestamp" => "timestampMatched" }
    }
}
output {
    stdout {
        codec => rubydebug
    }
}
When I run
bin\logstash agent -f \ELK-Stack\logstash\conf\01_input.conf
in the CMD prompt, the output is as follows:
io/console not supported; tty will not be manipulated
Default settings used: Filter workers: 4
Logstash startup completed
{
"message" => " <row adi=\"D\" date=\"2015-12-10\" garage=\"TOP\"
>\r",
"@version" => "1",
"@timestamp" => "2015-12-11T12:49:34.268Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"tags" => [
[0] "_grokparsefailure"
],
"debug-timestamp" => "timestampMatched"
}
{
"message" => " <codeNum order=\"1\">TP</codeNum>\r",
"@version" => "1",
"@timestamp" => "2015-12-11T12:49:34.268Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"tags" => [
[0] "_grokparsefailure"
],
"debug-timestamp" => "timestampMatched"
}
{
"message" => " <number order=\"1\">1001</number>\r",
"@version" => "1",
"@timestamp" => "2015-12-11T12:49:34.268Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"tags" => [
[0] "_grokparsefailure"
],
"debug-timestamp" => "timestampMatched"
}
{
"message" => " <journeystatus code=\"RT\">OnRoute</journeys
tatus>\r",
"@version" => "1",
"@timestamp" => "2015-12-11T12:49:34.278Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"tags" => [
[0] "_grokparsefailure"
],
"debug-timestamp" => "timestampMatched"
}
{
"message" => " </row>\r",
"@version" => "1",
"@timestamp" => "2015-12-11T12:49:34.278Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"tags" => [
[0] "_grokparsefailure"
],
"debug-timestamp" => "timestampMatched"
}
{
"message" => "y road update: <rows>\r",
"@version" => "1",
"@timestamp" => "2015-12-11T12:49:34.268Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"tags" => [
[0] "_grokparsefailure"
],
"debug-timestamp" => "timestampMatched"
}
{
"message" => "2015-12-10 13:50:25,487 [http-nio-8080-exec-26] INFO
a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- Journe\r",
"@version" => "1",
"@timestamp" => "2015-12-10T13:50:25.487Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"logTimestamp" => "2015-12-10;13:50:25.487",
"threadId" => "http-nio-8080-exec-26",
"logLevel" => "INFO",
"JavaClass" => "a.b.c.v1.myTestClass",
"TransactionID" => "[abcde-1234-12345-b425-12ad]",
"LogMessage" => "- Journe\r",
"debug-timestamp" => "timestampMatched"
}
{
"message" => "</rows>2015-12-10 13:50:25,487 [http-nio-8080-exec-26]
INFO a.b.c.v1.myTestClass [abcde-1234-12345-b425-12ad]- This Message is OK\r",
"@version" => "1",
"@timestamp" => "2015-12-10T13:50:25.487Z",
"host" => "GMAN",
"path" => "C:/data/sampleLogs/temp.log",
"type" => "testlog",
"logTimestamp" => "2015-12-10;13:50:25.487",
"threadId" => "http-nio-8080-exec-26",
"logLevel" => "INFO",
"JavaClass" => "a.b.c.v1.myTestClass",
"TransactionID" => "[abcde-1234-12345-b425-12ad]",
"LogMessage" => "- This Message is OK\r",
"debug-timestamp" => "timestampMatched"
}
I did add a multiline filter to the top of my filter block, just after the grok:
multiline {
    pattern => "^201*-**-**- **:**:"
    what => "previous"
    negate => true
}
but this did not help; it just keeps giving me an error message like
Error: Cannot use more than 1 filter worker because the following plugins don't work with more than one worker: multiline
You may be interested in the '--configtest' flag which you can use to validate logstash's configuration before you choose to restart a running system.
So I tried running with --configtest as suggested, but the same error message appears:
Error: Cannot use more than 1 filter worker because the following plugins don't work with more than one worker: multiline
Can anyone help me solve this and get Logstash to process these multi-line messages?
Your help is greatly appreciated.
UPDATE
As @Alain Collins suggested, I switched to the multiline codec; here is what the input section of my config now looks like.
input {
    file {
        path => "C:/data/sampleLogs/mulline.log"
        codec => multiline {
            # Grok pattern names are valid! :)
            pattern => "^%{TIMESTAMP_ISO8601} "
            negate => true
            what => previous
        }
        type => "testlog"
        start_position => "beginning"
    }
}
You found the right solution - multiline. The lines need to be joined up into one event.
The multiline filter, as you discovered, is not thread safe, so you can only run one filter worker in that Logstash instance.
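If you did want to keep the multiline filter, the workaround would be to force a single filter worker when starting Logstash, for example something along these lines (the flag is -w / --filterworkers in recent versions; check bin\logstash --help for yours):

bin\logstash agent -f \ELK-Stack\logstash\conf\01_input.conf -w 1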
There is a multiline codec that might work for you. It assembles the lines as part of the input{} phase and passes one event to the filter{} phase.
Note that you can use Logstash patterns with multiline, so "^%{YEAR}" would be better than "^201".
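For example, a minimal sketch of that multiline block with a pattern name instead of the literal prefix (the codec form takes the same three options):

multiline {
    # any line that does not start with a year is appended to the previous event
    pattern => "^%{YEAR}"
    negate => true
    what => "previous"
}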
Finally, keep an eye on filebeat, which is the replacement for logstash-forwarder. They say that client-side multiline support is planned, so the message would be sent as one event from the client and not have to be reassembled by logstash.
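Purely as a hypothetical sketch of what that client-side config could look like (the option names below are my assumption, mirroring the pattern/negate/match idea above; filebeat takes plain regular expressions rather than grok pattern names):

filebeat:
  prospectors:
    -
      paths:
        - C:/data/sampleLogs/*.log
      multiline:
        # assumed syntax: lines not starting with "YYYY-MM-DD " are appended to the previous event
        pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2} '
        negate: true
        match: after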