logstash: grok parse failure


I have this config file:

input {
  stdin {}
  file {
    type => "txt"
    path => "C:\Users\Gck\Desktop\logsatash_practice\input.txt"
    start_position => "beginning"
  }
}


filter {
    grok {
        match => [ "message", "%{DATE:timestamp} %{IP:client} %{WORD:method} %{WORD:text}"]
      }
    date {
        match => [ "timestamp", "MMM-dd-YYYY-HH:mm:ss" ]
        locale => "en"
    }
}

output {
    file {
        path => "C:\Users\Gck\Desktop\logsatash_practice\op\output3.txt"
    }
}

and let's say this is my input:

MAY-08-2015-08:00:00 55.3.244.1 GET hello

MAY-13-2015-13:00:00 56.4.245.2 GET world

After running it, every event is tagged with _grokparsefailure.

This is the output:

{"message":"MAY-08-2015-08:00:00\t55.3.244.1\thello\r","@version":"1","@timestamp":"2015-05-11T12:51:05.268Z","type":"txt","host":"user-PC","path":"C:\Users\Gck\Desktop\logsatash_practice\input.txt","tags":["_grokparsefailure"]}

{"message":"MAY-13-2015-13:00:00\t56.4.245.2\tworld\r","@version":"1","@timestamp":"2015-05-11T12:51:05.269Z","type":"txt","host":"user-PC","path":"C:\Users\Gck\Desktop\logsatash_practice\input.txt","tags":["_grokparsefailure"]}

What am I doing wrong?

Just as important: is there any guide that sums up this filtering in a clear way? The Elastic guides aren't detailed enough.


Solution

  • The DATE grok pattern is defined like this:

    DATE %{DATE_US}|%{DATE_EU}
    

    DATE_US and DATE_EU are in turn defined like this:

    DATE_US %{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}
    DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
    

    I could continue, but it's already clear that this doesn't match the actual content of your log message sample:

    MAY-08-2015-08:00:00 55.3.244.1 GET hello
    

    There's no stock grok pattern that matches this date format, but it's easy to put together a custom one. Also note that the separators between the tokens in your log messages aren't spaces but tabs (the \t in the output you posted). We can use \s to match any whitespace character. Working example:

    (?<timestamp>%{WORD}-%{MONTHDAY}-%{YEAR}-%{TIME})\s%{IP:client}\s%{WORD:method}\s%{WORD:text}
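
    For context, here is a sketch of how that expression could sit in the filter section of your config. It assumes the rest of the file (input, output, and the date filter) stays exactly as you posted it, and that your Logstash runs with its default of not processing backslash escapes in quoted strings, so \s reaches grok unchanged:

    ```
    filter {
        grok {
            # Custom named capture for the timestamp; \s matches the tab separators.
            match => [ "message", "(?<timestamp>%{WORD}-%{MONTHDAY}-%{YEAR}-%{TIME})\s%{IP:client}\s%{WORD:method}\s%{WORD:text}" ]
        }
        date {
            # Unchanged from the question.
            match => [ "timestamp", "MMM-dd-YYYY-HH:mm:ss" ]
            locale => "en"
        }
    }
    ```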
    

    Just as important: is there any guide that sums up this filtering in a clear way? The Elastic guides aren't detailed enough.

    With the exception of the grok-specific %{PATTERN_NAME:variable} notation, this is all just plain regular expressions, and there are many introductory guides for those elsewhere.
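
    To illustrate that point, here is a rough sketch of the same match with the two WORD captures written out as raw regular expressions. It relies on the stock WORD pattern being defined as \b\w+\b; the (?<name>...) form is the same named-capture syntax already used for the timestamp in the working example:

    ```
    filter {
        grok {
            # %{WORD:method} and %{WORD:text} replaced by equivalent named captures.
            match => [ "message", "(?<timestamp>%{WORD}-%{MONTHDAY}-%{YEAR}-%{TIME})\s%{IP:client}\s(?<method>\b\w+\b)\s(?<text>\b\w+\b)" ]
        }
    }
    ```

    Both forms should produce the same fields; the %{...} notation is just shorthand for a named, reusable regex.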