I have the following filter, which covers most of my needs:
filter {
  grok {
    match => { "message" => [ "%{IPORHOST:clientip} - %{NGUSER:user} \[%{HTTPDATE:timestamp}\] (?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest}) %{NUMBER:response} (?:%{NUMBER:bytes}|-) (-|(%{DATA:referrer}))" ] }
  }
}
However, some (not all) of the logs I am parsing contain the name of the channel a user is using on my Apache server.
A normal log including the word "channel" would be like this:
10.40.80.11 - [email protected] [03/Jan/2014:13:08:21 +0000] "GET /cgi-bin/feed/epg?channel=Bloomberg%20English&date=2016-01-03 HTTP/1.1" 200 368 "http://example.net/cgi-bin/feed/epg" "Mozilla/5.0"
The "rawrequest" part is saved in a separate field, like this:
"GET /cgi-bin/feed/epg?channel=Bloomberg%20English&date=2016-04-04 HTTP/1.1"
Question: How can I save the channel names in a separate field, given that not all logs contain the word "channel" in the "rawrequest" field?
I have seen lots of examples but nothing similar. The character separating the channel from the rest of the string is "&". I would appreciate any help.
Solution:
match => { "request" => [ "channel=(?<Channels>[^&]+)" ] }
Your existing grok is creating fields. You can create more fields from those fields by using another grok. A regexp like
channel=(?<myField>[^&]+)
should work, so your grok might look like this (untested):
grok {
match => { "request" => [ "channel=(?<myField>[^&]+)" ] }
}
This would create a new field called 'myField'. Rename as desired.
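To cover the lines that have no channel parameter, a minimal sketch of that second grok might look like the following (also untested). It assumes the URL ended up in the "request" field (swap in "rawrequest" if that is where your pattern puts it), and the field name "channel" is only an illustration. The conditional skips events without the parameter, the empty tag_on_failure keeps any remaining non-matches from being tagged _grokparsefailure, and urldecode turns a value such as Bloomberg%20English into Bloomberg English.

```
filter {
  # Only attempt the extraction when the parameter is present.
  # Assumption: the URL lives in "request"; use "rawrequest" if needed.
  if [request] =~ /channel=/ {
    grok {
      # "channel" is an illustrative field name; everything up to the next "&" is captured.
      match => { "request" => "channel=(?<channel>[^&]+)" }
      # Do not add _grokparsefailure if the capture still fails (e.g. an empty value).
      tag_on_failure => []
    }
    # Decode %20 and similar escapes in the captured value.
    urldecode {
      field => "channel"
    }
  }
}
```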
Another option would be to change your original grok pattern, using more specific built-in patterns rather than NOTSPACE. Check out the URI pattern. Unfortunately, that pattern doesn't create fields for you, so you'd have to modify it. If you put the URIPATHPARAM info in another field, you could then use the kv{} filter on it and parse all the pairs into their own fields.
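As a rough sketch of the kv route (untested, and applied here as a second grok on "request" rather than inside the original pattern), the query string can be captured with the built-in URIPATH and URIPARAM patterns and then split into pairs. The field names "uri_path", "uri_params" and "query" are hypothetical, chosen only for this example.

```
filter {
  grok {
    # Split the URL into its path and its "?key=value&..." part.
    # "uri_path" and "uri_params" are illustrative field names.
    match => { "request" => "%{URIPATH:uri_path}(?:%{URIPARAM:uri_params})?" }
    tag_on_failure => []
  }
  kv {
    # URIPARAM keeps the leading "?", so split on both "?" and "&".
    source => "uri_params"
    field_split => "&?"
    # Nest the parsed pairs under one field, e.g. [query][channel] and [query][date].
    target => "query"
  }
}
```

Events whose request has no query string simply have no "uri_params" field, so kv has nothing to parse for them.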
Lots of options...