Logstash: Custom delimiter for multi-line XML logs


I have XML logs in which each entry is terminated by "=======", e.g.

<log>
  <level>DEBUG</level>
  <message>This is debug level</message>
</log>

=======

<log>
  <level>ERROR</level>
  <message>This is error level</message>
</log>

=======

Each entry can span multiple lines.

How can I parse these logs with Logstash?


Solution

  • This can be done with the multiline codec. The delimiter "=======" serves as the pattern, like this:

    input {
      file {
        type => "xml"
        path => "/path/to/logs/*.log"
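        codec => multiline {
          # A line starting with the delimiter begins a new event
          pattern => "^======="
          # negate => true: act on lines that do NOT match the pattern
          negate => true
          # Append those non-matching lines to the previous line's event
          what => "previous"
        }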
      }
    }
    
    filter {
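      # Remove the delimiter that the codec leaves in the event
      mutate {
        gsub => [ "message", "=======", ""]
      }
      # Parse the remaining XML into the "log" field
      xml {
        force_array => false
        source => "message"
        target => "log"
      }
      # Drop the raw text once it has been parsed
      mutate {
        remove_field => [ "message" ]
      }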
    }
    
    output {
      elasticsearch {
        codec => json
        hosts => ["http://localhost:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }
    

    Here the combination of pattern and negate => true means that any line that does not start with "=======" is appended to the previous event (hence what => "previous"). A line that starts with the delimiter begins a new event, so the delimiter ends up at the top of the following event's message. In the filter, gsub removes the delimiter and the xml plugin then parses what remains.
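
One thing the configuration above leaves open is the end of the file. The codec holds the lines it has collected until the next delimiter line arrives, so whatever follows the last delimiter stays buffered until more data is written. The variant below is a minimal sketch of one way to handle that, assuming the multiline codec's auto_flush_interval option is available in your plugin version; the 2-second value is an arbitrary choice for illustration. It also trims whitespace and drops events that are empty once the delimiter is removed, which is what a trailing "=======" at the end of the file would otherwise turn into.

```conf
input {
  file {
    path => "/path/to/logs/*.log"
    codec => multiline {
      pattern => "^======="
      negate => true
      what => "previous"
      # Assumption: emit the buffered lines after 2 idle seconds
      auto_flush_interval => 2
    }
  }
}

filter {
  mutate {
    gsub => [ "message", "=======", "" ]
  }
  mutate {
    # Trim the blank lines left around the XML
    strip => [ "message" ]
  }
  # A delimiter-only event is empty at this point; discard it
  if [message] == "" {
    drop { }
  }
  xml {
    force_array => false
    source => "message"
    target => "log"
  }
}
```

The two mutate blocks are kept separate so the order is explicit: the delimiter is removed first, then the remaining text is trimmed before the emptiness check.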