
DATA pattern in logstash - grok


I am trying to understand the DATA pattern in the grok plugin for Logstash. According to the documentation, DATA is defined as follows:

DATA .*? --> I interpreted this as matching anything of length 1 to n [please correct me if my understanding is wrong].

In my script, it fails to parse my input properly.
Logstash conf:

input {
    file {
        path => ["/home/osboxes/logstash_conf/mydir/test_logs/*"]
        start_position => "beginning"
        sincedb_path => "/home/osboxes/logstash_conf/mydir/.sincedb"
    }
}
filter {
    grok {
        match => { "message" => "^%{TIMESTAMP_ISO8601:timeStamp},%{DATA:ID},%{DATA:somedata}" }
    }
}
output {
    stdout {
        codec => json_lines
    }
}

Input:

2017-01-09 02:00:03.887,a,a 

Output:

{
    "message": "2017-01-09 02:00:03.887,a,a",
    "@version": "1",
    "@timestamp": "2017-01-09T12:28:20.958Z",
    "path": "/home/osboxes/logstash_conf/mydir/test_logs/data",
    "host": "osboxes",
    "timeStamp": "2017-01-09 02:00:03.887",
    "ID": "a"
}

I expected the somedata field to be populated [as the ID field was], but it is omitted from the output. Can anyone please help me understand the behavior of the DATA pattern?


Solution

  • .*? matches any character between zero and unlimited times, as few times as possible, expanding only as needed. Because it can match zero times, the last DATA in your pattern captures an empty string, and grok drops empty captures by default (keep_empty_captures defaults to false), which is why you don't see the field.

    Written out, the problematic part looks like this:

    ,(.*?),(.*?) (capture groups added for readability)

    This matches: ,a,

    1. Match the first , literally.

    2. Try to match .*? with as few characters as possible (expanding one character at a time until the rest of the pattern can match). This captures the a.

    3. Try to match the next ,. This succeeds, so the first .*? is done.

    4. Try to match the second .*?. Since it can match zero times and nothing follows it in the pattern, it matches the empty string and the matching is complete.


    The simple solution to your problem is to add a $ at the end of your pattern, i.e. "^%{TIMESTAMP_ISO8601:timeStamp},%{DATA:ID},%{DATA:somedata}$". $ is an end-of-string anchor, so your second .*? is forced to expand until it has matched the other a.
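
To see the effect of the anchor outside Logstash, here is a minimal sketch using Python's re module instead of the Oniguruma engine that grok uses. It assumes the lazy quantifier behaves the same way in both engines for this input. TIMESTAMP is a hypothetical, simplified stand-in for TIMESTAMP_ISO8601 that only covers timestamps shaped like the sample line, and the named groups play the role of the grok field names.

```python
import re

# Simplified stand-in for grok's TIMESTAMP_ISO8601 (an assumption that only
# covers timestamps shaped like the sample line).
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}"

# DATA is the lazy pattern .*?; named groups stand in for grok field names.
BODY = rf"^(?P<timeStamp>{TIMESTAMP}),(?P<ID>.*?),(?P<somedata>.*?)"

unanchored = re.compile(BODY)        # the pattern from the question
anchored = re.compile(BODY + "$")    # the same pattern with the suggested $

line = "2017-01-09 02:00:03.887,a,a"

before = unanchored.search(line)
after = anchored.search(line)

# Without the anchor, the match stops right after the second comma.
print(repr(before.group(0)))           # '2017-01-09 02:00:03.887,a,'
print(repr(before.group("somedata")))  # ''

# With the anchor, the last lazy group must expand to the end of the line.
print(repr(after.group("somedata")))   # 'a'
```

In both cases ID is captured as a. Only the last group changes, because it is the only lazy group with nothing after it to force it to expand.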