Tags: elasticsearch, logstash, elastic-stack, filebeat, nxlog

Ship only a percentage of logs to logstash


How can I configure filebeat to only ship a percentage of logs (a sample if you will) to logstash?

In my application's log folder the logs are chunked to about 20 megs each. I want filebeat to ship only about 1/300th of that log volume to logstash.

I need to pare down the log volume before I send it over the wire to logstash, so I can't do this filtering from logstash; it needs to happen on the endpoint, before the logs leave the server.

I asked this question in the ES forum and someone said it was not possible with filebeat: https://discuss.elastic.co/t/ship-only-a-percentage-of-logs-to-logstash/77393/2

Is there really no way I can extend filebeat to do this? Can nxlog or another product do this?


Solution

  • There's no way to configure Filebeat to drop arbitrary events based on a probability, but Filebeat can drop events based on conditions. There are two ways to filter events.

    Filebeat can include or exclude specific lines as it reads a file. This is the most efficient place to apply the filtering because it happens early. It is done with the include_lines and exclude_lines settings in the config file; an exclude_lines example follows, with an include_lines sketch after it.

    filebeat.prospectors:
    - paths:
      - /var/log/myapp/*.log
      exclude_lines: ['^DEBUG']
    
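    As a complement to the exclude_lines block above, here is a minimal sketch of the opposite approach using include_lines. The ^ERROR and ^WARN prefixes are assumptions about this application's log format, not something taken from the question:

    filebeat.prospectors:
    - paths:
      - /var/log/myapp/*.log
      # Keep only lines beginning with ERROR or WARN; all other lines are
      # discarded while the file is read, so they never leave the server.
      include_lines: ['^ERROR', '^WARN']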

    All Beats have "processors" that apply an action when a condition matches. One such action is drop_event, and the available conditions are regexp, contains, equals, and range (a further sketch using contains and range appears below the example).

    processors:
    - drop_event:
        when:
          regexp:
            message: '^DEBUG'
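
    As a further illustration of the other condition types, here is a hedged sketch combining contains and range. The health-check message text and the http.response.code field are hypothetical examples, not fields from the original question:

    processors:
    # Drop noisy health-check lines outright.
    - drop_event:
        when:
          contains:
            message: 'health-check'
    # Drop successful requests, keeping only responses with a status code of 400 or above.
    - drop_event:
        when:
          range:
            http.response.code:
              lt: 400

    Either way, the processors run inside the Beat itself, so the filtering happens on the endpoint before anything goes over the wire, which matches the requirement in the question.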