Search code examples
logstashlogstash-groklogstash-configuration

How to generate single output json object from multiple log lines in logstash filter?


I am new to Logstash and Grok filter. I want to parse logs like these -

2018-01-11 17:17:16,071 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | CommittedVirtualMemorySize :: 401186816 
2018-01-11 17:17:16,071 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | FreePhysicalMemorySize :: 1751130112 
2018-01-11 17:17:16,072 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | FreeSwapSpaceSize :: 4294967295 
2018-01-11 17:17:16,694 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | ProcessCpuLoad :: -1.0 
2018-01-11 17:17:16,694 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | ProcessCpuTime :: 47471104300 
2018-01-11 17:17:16,698 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | SystemCpuLoad :: 1.0 
2018-01-11 17:17:16,698 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | TotalPhysicalMemorySize :: 4285849600 
2018-01-11 17:17:16,698 | DEBUG | [Thread-2] | com.example.monitor.MonitorHelper:cpuMonitoring(307) | TotalSwapSpaceSize :: 4294967295 

to a JSON Object like this -

{
  "timestamp": "2018-01-11 17:17:16,071",
  "log_level": "DEBUG",
  "thread_name": "Thread-2",
  "class": "com.example.monitor.MonitorHelper",
  "method": "cpuMonitoring",
  "line_number": "307",
  "CommittedVirtualMemorySize": "401186816",
  "FreePhysicalMemorySize": "1751130112",
  "FreeSwapSpaceSize": "4294967295",
  "ProcessCpuLoad": "-1.0",
  "ProcessCpuTime": "47471104300",
  "SystemCpuLoad": "1.0",
  "TotalPhysicalMemorySize": "4285849600",
  "TotalSwapSpaceSize": "4294967295"
}

As of now my grok pattern is -

%{TIMESTAMP_ISO8601:timestamp} \| %{LOGLEVEL:log_level} \| [(?\b[\w-]+\b)] \| %{JAVAFILE:class}:%{JAVAMETHOD:method}(%{NUMBER:line_number}) \| %{GREEDYDATA:log_message}

which provides multiple output lines for each input log line. JSON object looks like this-

{
  "timestamp": "2018-01-11 17:17:16,071",  
  "log_level": "DEBUG",
  "thread_name": "Thread-2",
  "class": "com.example.monitor.MonitorHelper",
  "method": "cpuMonitoring",
  "line_number": "307",
  "log_message": "CommittedVirtualMemorySize :: 401186816 "
}

can you please help me with what I need to look for in order to achieve this?


Solution

  • The first recommendation is to change the original log output into a single line.

    If you can't, and you're using filebeat to ship the file, use FB's multiline config to merge the lines before sending it to logstash.

    If you're not using filebeat, you can try to use the multiline codec in logstash.