
Parsing data with the grok filter in Logstash


I'm having problems using the grok filter in Logstash. I have this log line:

83.149.9.216 - - [04/Jan/2015:05:13:42 +0000]

I want to parse the IP and the date. I have the pattern below, but I'm getting no matches.

^%{IPV4:req_id} - - \[(?<date>%{DAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND} +0000)]

What am I doing wrong? Thanks!


Solution

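  • %{DAY} matches a day-of-week name (Mon, Tuesday, ...), not a number, so it cannot match 04. Use %{MONTHDAY} for the day of the month instead. Also escape the +: left unescaped, it is a quantifier applied to the preceding space, so " +0000" matches one or more spaces followed by 0000 and never a literal + char: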

    ^%{IPV4:req_id} - - \[(?<date>%{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND} \+0000)]
                                  ^^^^^^^^^^^                                              ^
    

    As suggested by Calvin Taylor, you may further enhance the pattern to match any ISO8601 time zone with %{ISO8601_TIMEZONE} instead of \+0000:

    ^%{IPV4:req_id} - - \[(?<date>%{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND} %{ISO8601_TIMEZONE})]
                                                                                           ^^^^^^^^^^^^^^^^^^^
    

    See the relevant Grok pattern definitions:

    MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
    DAY (?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)
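
Both fixes can be checked outside Logstash with a small Python sketch. It reuses the MONTHDAY and DAY definitions quoted above verbatim. Everything else is an assumption made for illustration: the names PREFIX, REST, build and line are hypothetical, and PREFIX and REST are simplified stand-ins for %{IPV4}, %{MONTH}, %{YEAR}, %{HOUR}, %{MINUTE} and %{SECOND}, not the real grok definitions. Python also writes named groups as (?P<date>...) where grok uses (?<date>...).

```python
import re

# Definitions copied from the Grok patterns quoted above.
MONTHDAY = r"(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])"
DAY = (r"(?:Mon(?:day)?|Tue(?:sday)?|Wed(?:nesday)?|Thu(?:rsday)?"
       r"|Fri(?:day)?|Sat(?:urday)?|Sun(?:day)?)")

# Simplified stand-ins (assumptions, NOT the real grok definitions).
PREFIX = r"^\d{1,3}(?:\.\d{1,3}){3} - - \["    # rough %{IPV4} plus " - - ["
REST = r"/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}"  # rough /MONTH/YEAR:H:M:S

line = "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000]"


def build(day, tz):
    """Assemble a regex shaped like the grok pattern from the question."""
    return re.compile(PREFIX + "(?P<date>" + day + REST + " " + tz + ")]")


# 1. DAY expects a weekday name, so it cannot match "04".
print(build(DAY, r"\+0000").search(line))        # None

# 2. Unescaped "+" quantifies the space before it: " +0000" means
#    "one or more spaces, then 0000", which the log line does not contain.
print(build(MONTHDAY, "+0000").search(line))     # None

# 3. MONTHDAY plus an escaped "+" matches and captures the timestamp.
match = build(MONTHDAY, r"\+0000").search(line)
print(match.group("date"))                       # 04/Jan/2015:05:13:42 +0000
```

The first two calls reproduce the two separate reasons the original pattern failed, so either mistake alone is enough to produce no match.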