
Catching a comma separated pattern with Grok


I am parsing through a set of logs where one field is giving me issues. The format is

header(ip, date etc.) field1=data, field2=data, field3=data, field4=data

I have a general parser which reads something like

match => [ "message","%{DATA:..header..} %{DATA}=%{DATA:service},%{DATA}=%{DATA:roles}],%{DATA}=%{DATA:macaddress},%{DATA}=%{DATA:nasip}"]

Sometimes the value portion of the "roles" field looks like value, [Admin]. This is handled by the ] in %{DATA}=%{DATA:roles}], but in other cases I get

subvalue1, subvalue2, subvalue3, 

or

subvalue1, subvalue2, subvalue3, subvalue4, 

or

subvalue1, subvalue2, 

and the parser only captures subvalue1. As you can see, there is a variable number of subvalues, and they are hard to catch when the ] is missing.

Here is an example of the kind of log creating issues:

local1--debug--10.47.130.2--2017-03-24--2017-03-24T11:29:51-04:00--11:29:51,545 10.241.186.253 ZTP0 SESSION 20 1 0 Common.Username=LABF5CHK,Common.Service=F5_HealthCHK,Common.Roles=Employee, [User Authenticated],Common.NAS-IP-Address=xxxxxxxxxxxx,Common.Request-Timestamp=2017-03-24 11:27:56-04

Is there a workaround for this?


Solution

  • For variable length comma separated data I would suggest capturing the whole set of values as one field and then parsing that field using the csv filter.

    For parsing a set of key=value pairs I suggest using the kv filter.

    So your config would look something like this:

    filter {
      grok {
        match => [ "message","%{DATA:..header..} %{GREEDYDATA:kv_pairs}"]
      }
      kv {
        source => "kv_pairs"
        field_split => ","
      }
      csv {
        # assumes that the key was 'roles'
        source => "roles"
        target => "role_list"
      }
    }
    

    I am not sure of the exact format of your log messages, but the kv filter above might break if your messages don't separate the subvalue CSV list from the list of k=v pairs, like this:

    ...,key=value,roles=subval1,subval2,subval3,key2=value2...
    

    It might also break if a message opens a list with [ but doesn't close it.

    Edit: It looks as though that first breaking case is in fact what you're facing.

    If the roles section is always in the same place, followed by the same key, you could match it using

    ...Common.Roles=%{DATA:roles},Common.NAS-IP-Address=%{DATA:nasip}...
    

    If these kv pairs are consistently in the same arrangement, this pattern should work. If a field is at all consistent, or matchable by a more specific regex than .*?, you should use that: use the actual key names/patterns instead of %{DATA}=, as the generic pattern easily leads to mismatches.
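
    As a rough sketch of that advice applied to the sample log line from the question, the config below anchors each capture on its literal key name and then splits the roles value into a list. The target field names (username, service, roles, nasip, request_timestamp) are illustrative choices, not anything your setup requires, and the key order is assumed to match the sample line.

    ```
    filter {
      grok {
        # Literal key names act as anchors, so the lazy %{DATA} for roles
        # keeps going past inner commas until it reaches ",Common.NAS-IP-Address=".
        match => {
          "message" => "Common\.Username=%{DATA:username},Common\.Service=%{DATA:service},Common\.Roles=%{DATA:roles},Common\.NAS-IP-Address=%{DATA:nasip},Common\.Request-Timestamp=%{GREEDYDATA:request_timestamp}"
        }
      }
      # Turn the comma-separated roles string into an array.
      mutate {
        split => { "roles" => "," }
      }
      # A separate mutate block, so the strip runs after the split
      # and trims the whitespace left around each element.
      mutate {
        strip => ["roles"]
      }
    }
    ```

    With the sample line, roles should end up as a list containing Employee and [User Authenticated]. If the keys can appear in a different order or some can be missing, this fixed pattern will not match and would need separate patterns per layout.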