Split log message on space for grok pattern


I am two days into learning grok and ELK. I am struggling to break up log messages on spaces so that the parts appear as separate fields in Logstash.

A sample line from my log is: 2022-02-11 11:57:49 - app - INFO - function_name=add elapsed_time=0.0296 input_params=6_3

I would like to see separate fields in Logstash/Kibana for function_name, elapsed_time and input_params.

At the moment, I have the following .conf:

input{
  file{
  path => "/path/to/log/file"
  start_position => "beginning"
  }
}
filter{
  grok{
  match => {"message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} %{(?<function_name>[^.]*)\.(?<elapsed_time>[^.]*)\.(?<input>[^.]*)}"}
  }
    date {
    match => ["timestamp", "ISO8601"]
    }
    function_name {
    match => ["function_name", "DATA"]
    }
    elapsed_time {
    match => ["elapsed_time", "BASE16FLOAT"]
    }
    input {
    match => ["input", "DATA"]
    }
}
output{
  elasticsearch{
  hosts => ["localhost:9200"]
  index => "math_apis"
  }
  stdout{codec => rubydebug}
}

But this only produces the following event in Logstash, tagged with _grokparsefailure:

{
          "host" => "hostname",
    "@timestamp" => 2022-02-11T06:27:49.404Z,
       "message" => "2022-02-11 11:57:49 - app - INFO - function_name=add elapsed_time=0.0296 input_params=6_3",
          "path" => "path/to/log/file",
      "@version" => "1",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}

Solution

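  • The pattern has to account for every literal piece of the line, including the " - " separators and the app segment between the timestamp and the log level, which the pattern in the question skips. You can use the following pattern: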

    %{TIMESTAMP_ISO8601:timestamp} - \S+ - %{LOGLEVEL:log_level} - function_name=%{NOTSPACE:function_name} elapsed_time=%{NOTSPACE:elapsed_time} input_params=%{NOTSPACE:input}
    

    Details:

    • %{TIMESTAMP_ISO8601:timestamp} - the timestamp, captured into the timestamp field
    • " - " - a literal string (space, hyphen, space)
    • \S+ - one or more non-whitespace chars (here, app; not captured)
    • " - " - a literal string (space, hyphen, space)
    • %{LOGLEVEL:log_level} - the LOGLEVEL pattern, captured into the log_level field
    • " - function_name=" - a literal string
    • %{NOTSPACE:function_name} - function_name field of one or more non-whitespace chars
    • " elapsed_time=" - a literal space followed by the elapsed_time= string
    • %{NOTSPACE:elapsed_time} - elapsed_time field of one or more non-whitespace chars
    • " input_params=" - a literal space followed by the input_params= string
    • %{NOTSPACE:input} - input field of one or more non-whitespace chars.

    See the Logstash grok filter documentation for more about Grok patterns.
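
    Dropped into the filter section of the config from the question, the pattern could look like the sketch below. Two things in it are assumptions rather than part of the answer: the explicit date format string (chosen because the sample timestamp separates date and time with a space rather than a T) and the mutate step that converts elapsed_time from a string to a float. The function_name, elapsed_time and input blocks from the question are left out because they are not standard Logstash filter plugins.

    ```
    filter {
      grok {
        # Pattern from the answer: every literal " - " and "key=" is matched explicitly.
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} - \S+ - %{LOGLEVEL:log_level} - function_name=%{NOTSPACE:function_name} elapsed_time=%{NOTSPACE:elapsed_time} input_params=%{NOTSPACE:input}" }
      }
      date {
        # Assumed format for timestamps like "2022-02-11 11:57:49".
        match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
      }
      mutate {
        # grok captures are strings; convert so elapsed_time is stored as a number.
        convert => { "elapsed_time" => "float" }
      }
    }
    ```

    The input and output sections from the question can stay as they are.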

    Test output:

    {
      "timestamp": [
        [
          "2022-02-11 11:57:49"
        ]
      ],
      "YEAR": [
        [
          "2022"
        ]
      ],
      "MONTHNUM": [
        [
          "02"
        ]
      ],
      "MONTHDAY": [
        [
          "11"
        ]
      ],
      "HOUR": [
        [
          "11",
          null
        ]
      ],
      "MINUTE": [
        [
          "57",
          null
        ]
      ],
      "SECOND": [
        [
          "49"
        ]
      ],
      "ISO8601_TIMEZONE": [
        [
          null
        ]
      ],
      "log_level": [
        [
          "INFO"
        ]
      ],
      "function_name": [
        [
          "add"
        ]
      ],
      "elapsed_time": [
        [
          "0.0296"
        ]
      ],
      "input": [
        [
          "6_3"
        ]
      ]
    }
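
    Because the tail of the line is a run of space-separated key=value pairs, the kv filter is a possible alternative to naming each key in the grok pattern. The sketch below is an assumption layered on the answer, not part of it: grok captures everything after the log level into a hypothetical kv_pairs field, and kv then splits that field on spaces and "=".

    ```
    filter {
      grok {
        # Parse the fixed prefix; keep the rest of the line in a temporary field.
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} - \S+ - %{LOGLEVEL:log_level} - %{GREEDYDATA:kv_pairs}" }
      }
      kv {
        source       => "kv_pairs"   # hypothetical temporary field from the grok above
        field_split  => " "          # pairs are separated by spaces
        value_split  => "="          # key and value are separated by "="
        remove_field => ["kv_pairs"] # drop the temporary field once it is parsed
      }
    }
    ```

    With this variant the field names come from the log line itself, so the last field would be named input_params rather than input, and new keys added to the log later would be picked up without changing the pattern.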