Tags: logstash, logstash-grok, logstash-configuration

Grok recreate timestamp and message


I'm trying to create a grok pattern for the following formats:

October 27, 2015 03:44: lorem created a new project "lorem / ipsum"
October 27, 2015 03:48: lorem created a new project "lorem / ipsum-cp"
October 27, 2015 18:38: john created a new project "john / playgroud"
October 27, 2015 18:42: joseph created a new project "joseph / test-ci"

I couldn't find a single expression to match the full date, so I did the following:

grok {
    match => { "message" => "%{MONTH:month}%{SPACE}%{NUMBER:day}, %{YEAR:year}%{SPACE}%{HOUR:hour}:%{NUMBER:minute}"}
}

This creates a set of fields for all of the parts of the datetime stamp. Now I was wondering what would be the best way to deal with the rest of the line and to recreate a timestamp.

I was considering using a mutate to join all of the fields together and parsing the result with the date filter. But should I rewrite the message field to contain only the rest of the line, like lorem created a new project "lorem / ipsum", or leave it untouched to reflect the original line?


Solution

  • To put the rest of the line into a field, use GREEDYDATA at the end of your pattern:

     %{GREEDYDATA:remainder}
    

    Since I'm putting the leading data into a new field, I'll usually put the remainder back into the 'message' field:

     %{GREEDYDATA:message}
    

    which also requires the 'overwrite' parameter to be set on the grok{}.
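
    As a rough sketch, assuming the field names from the question (with MONTHDAY and MINUTE, both standard grok patterns, swapped in for NUMBER), the complete filter might look like this. Without 'overwrite', grok would add the captured text to the existing 'message' field instead of replacing it:

    grok {
        # Capture the date parts, then put the rest of the line back into 'message'.
        match => { "message" => "%{MONTH:month} %{MONTHDAY:day}, %{YEAR:year} %{HOUR:hour}:%{MINUTE:minute}: %{GREEDYDATA:message}" }
        overwrite => [ "message" ]
    }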

    There are a couple of ways to get a single date. One is, as you suggested, combining them in logstash:

    mutate {
       add_field => {
          "myDateField" => "%{myMonth} %{myDay} %{myYear}"
       }
    }
    

    Then you'd need a matching pattern for the date{} filter.
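
    For instance, if the combined field held a value like "October 27 2015" (an assumption based on the sample lines), a matching date{} filter might look roughly like this:

    date {
        # MMMM = full month name, d = day of month, yyyy = four-digit year
        match => [ "myDateField", "MMMM d yyyy" ]
    }

    By default the parsed result is written to @timestamp. If the hour and minute were added to myDateField as well, the format string would need matching HH and mm parts.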

    If you only want the one date field, then there's little reason to make all the little fields (month, day, year). Use a grok pattern that pulls everything you want into one field:

    ^(?<myDateField>[^:]+:[^:]+):
    

    ("from the beginning of the line, everything up to the second colon goes into a field called myDateField"; the first colon is part of the time, as in "03:44")
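
    Put together with the date{} filter, a sketch for the sample lines might look like the following. Because the time contains a colon of its own ("03:44"), the capture here runs to the second colon so the minutes are kept. The format string is an assumption based on the sample data:

    grok {
        # myDateField becomes e.g. "October 27, 2015 03:44"
        match => { "message" => "^(?<myDateField>[^:]+:[^:]+): %{GREEDYDATA:message}" }
        overwrite => [ "message" ]
    }
    date {
        # Parse the captured string; the result goes to @timestamp by default.
        match => [ "myDateField", "MMMM d, yyyy HH:mm" ]
    }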

    Another comment: if you always have a single space between patterns, don't use %{SPACE}. This is easier to read:

    %{YEAR:year} %{HOUR:hour}
    

    Though if you might have multiple spaces, or other types of whitespace, then do use %{SPACE}.