I'm trying to create a grok pattern for the following formats:
October 27, 2015 03:44: lorem created a new project "lorem / ipsum"
October 27, 2015 03:48: lorem created a new project "lorem / ipsum-cp"
October 27, 2015 18:38: john created a new project "john / playgroud"
October 27, 2015 18:42: joseph created a new project "joseph / test-ci"
I couldn't find a single expression to match the full date, so I did the following:
grok {
  match => { "message" => "%{MONTH:month}%{SPACE}%{NUMBER:day}, %{YEAR:year}%{SPACE}%{HOUR:hour}:%{NUMBER:minute}" }
}
This creates a separate field for each part of the datetime stamp. Now I was wondering what would be the best way to deal with the rest of the line and to recreate a timestamp.
I was considering using a mutate to join all of the fields together and then parsing the result with the date filter. But should I rewrite the message field so it holds only the rest of the line, like
lorem created a new project "lorem / ipsum"
or leave it untouched to reflect the original line?
To put the rest of the line into a field, use GREEDYDATA at the end of your pattern:
%{GREEDYDATA:remainder}
Since I'm putting the leading data into a new field, I'll usually put the remainder back into the 'message' field:
%{GREEDYDATA:message}
which also requires the 'overwrite' parameter to be set on the grok{}.
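Put together with the pattern from the question, that could look like the sketch below. It assumes every line has the shape shown in the samples, and it writes the colon between hour and minute literally so that 03:44 matches.

```logstash
filter {
  grok {
    # Date parts go into their own fields; the text after ": " goes back into "message".
    match => {
      "message" => "%{MONTH:month} %{NUMBER:day}, %{YEAR:year} %{HOUR:hour}:%{NUMBER:minute}: %{GREEDYDATA:message}"
    }
    # Needed so the captured remainder replaces the original message.
    overwrite => ["message"]
  }
}
```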
There are a couple of ways to get a single date. One is, as you suggested, combining them in Logstash:
mutate {
  add_field => {
    "myDateField" => "%{myMonth} %{myDay} %{myYear}"
  }
}
Then you'd need a matching pattern for the date{} filter.
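As a sketch of that combination, assume grok has already produced fields named month, day, year, hour and minute (the names from the question, not the placeholder names used here), and that the parsed date should end up in @timestamp, which is the date filter's default target:

```logstash
filter {
  mutate {
    # Rebuild the stamp in the same layout as the log line, e.g. "October 27, 2015 03:44".
    add_field => { "myDateField" => "%{month} %{day}, %{year} %{hour}:%{minute}" }
  }
  date {
    # Joda-style format: full month name, day of month, 4-digit year, 24-hour time.
    match => ["myDateField", "MMMM d, yyyy HH:mm"]
    # Assumption: the month names are English.
    locale => "en"
    # The log carries no zone; unless "timezone" is set, the host's zone is used.
    # The helper fields are dropped only when the parse succeeds.
    remove_field => ["myDateField", "month", "day", "year", "hour", "minute"]
  }
}
```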
If you only want the one date field, then there's little reason to make all the little fields (month, day, year). Use a grok pattern that pulls everything you want into one field:
^(?<myDateField>[^:]+:[^:]+):
("from the beginning of the line, everything up to the second colon goes into a field called myDateField"; the first colon is the one inside the time, e.g. 03:44)
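For example, a sketch that captures the whole stamp into one field and hands it straight to date{}, leaving message as the original line. It spells the stamp out with stock grok patterns instead of a character class, which is a choice made here for readability; myDateField is just the placeholder name used above.

```logstash
filter {
  grok {
    # Capture "October 27, 2015 03:44" as one string; "message" is not modified.
    match => {
      "message" => "^(?<myDateField>%{MONTH} %{MONTHDAY}, %{YEAR} %{HOUR}:%{MINUTE}):"
    }
  }
  date {
    # Parse the captured string into @timestamp, then drop the helper field.
    match => ["myDateField", "MMMM d, yyyy HH:mm"]
    remove_field => ["myDateField"]
  }
}
```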
Another comment: if you always have a single space between patterns, don't use %{SPACE}. This is easier to read:
%{YEAR:year} %{HOUR:hour}
Though if you might have multiple spaces, or other types of whitespace, then do use %{SPACE}.