Logstash 6.4.1 (and higher) “logstash-input-file” using “read” mode not working as expected/desired on macOS


So I have a fairly modest Logstash setup for Apache logs that I am using on RedHat 7 (production) as well as macOS High Sierra (10.13.6) for development, and something odd has happened since upgrading from Logstash version 6.3.2 to 6.4.1. I am using Homebrew on macOS to install and update Logstash, and these issues persist even if I “nuke” my installed Homebrew items and reinstall.

Straight to the point.

Simply put, static data input files are not being read and ingested on startup in 6.4.1 as they once were on 6.3.2 and earlier. On 6.4.1 I need to manually cat log lines to the target path for Logstash to “wake up” and pick up these new lines, even if I designate the new read mode.

At the end of the day, this setup doesn’t need a sincedb setup and can be restarted and read from the head of file to end and we are all happy… At least until Logstash 6.4.1… Now nobody is happy. What can be done to force Logstash to always read data from the beginning of files no matter what?

Details and discovery.

The Logstash setup I am using just does some filtering of Apache logs for input. The input config I am using reads as follows; note that the file path is slightly tweaked for privacy but is effectively exactly what I am using right now and have been using for the past year or so without issue:

input {

  file {
    path => "/opt/logstash/coolapp/access_log*"
    exclude => "*.gz"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    ignore_older => 0
    close_older => 3600
    stat_interval => 1
    discover_interval => 15
  }

}

The way I am using this for local development is simply getting a copy of remote Apache server logs and placing them in that /opt/logstash/coolapp/ directory.

Then I start up Logstash via the command line like this, with the -f option set so my coolapp-apache.conf is read:

logstash -f coolapp-apache.conf

Logstash starts up locally and emits its whole pile of startup status messages until this final message:

[2018-09-24T12:40:09,458][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Which to me indicates it’s fully up and running and checking my data collection output shows—if it is working—a flow of data pouring in… But when using Logstash 6.4.1 I see no data flowing in.

File input plugin works with tail mode.

Checking the newly updated documentation for the file input plugin (v4.1.5) shows there is a new mode option that has a read mode and a tail mode. Knowing that the default mode is tail I tested the setup by doing the following after starting up my local Logstash debugging setup. First I copied the access_log as follows:

cp /opt/logstash/coolapp/access_log /opt/logstash/coolapp/access_log_BAK

Then I zeroed out the main access_log file using :> like this:

:> /opt/logstash/coolapp/access_log

And finally I just ran cat and appended that copied file’s data to the original file like this:

cat /opt/logstash/coolapp/access_log_BAK > /opt/logstash/coolapp/access_log

The second I did that, lo and behold the data started to flow as expected! I guess the new file input plugin is focused on tailing a file more than reading? Anyway, that works but is clearly annoying. I don’t develop like this. I need Logstash to simply read the files and parse them.

File input plugin not working with read mode.

So I tried using the following setup to just read the files based on what I saw in the official Logstash file input mode documentation:

input {

  file {
    path => "/opt/logstash/coolapp/access_log"
    mode => "read"
    file_completed_action => "log"
    file_completed_log_path => "/Users/Giacomo1968/Desktop/access_log_foo"
  }

}

Of course, access_log_foo is just a proof-of-concept file name for testing, but when all is said and done this read mode utterly does not work on macOS. I have even tried changing the path to something like my desktop and it doesn’t work. And the whole “zero out and then append a file” trick I used as explained in the “tail mode” explanation doesn’t cut it here, since the file is not being tailed I guess?

So knowing all of that:

What can be done to force Logstash 6.4.1 to always read data from the beginning of files no matter what as it once did effortlessly in Logstash version 6.3.2 and previous?


Solution

  • Okay, I figured this out. I am now on Logstash 6.5 and my original config was as follows:

    input {
    
      file {
        path => "/opt/logstash/coolapp/access_log*"
        exclude => "*.gz"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        ignore_older => 0
        close_older => 3600
        stat_interval => 1
        discover_interval => 15
      }
    
    }
    

    When I redid it, getting rid of ignore_older and adjusting close_older and stat_interval to use string_duration values, things started working again as expected.

    input {
    
      file {
        path => "/opt/logstash/coolapp/access_log*"
        exclude => "*.gz"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        close_older => "1 hour"
        stat_interval => "1 second"
        discover_interval => 15
      }
    
    }
    

    My assumption is that Logstash 6.3.2 interpreted ignore_older being set to 0 as false, thus disabling ignore_older, but that in version 6.4 and higher the value is interpreted as an actual time value in seconds, so every file older than zero seconds gets ignored. I haven’t dug deeply into the source code, but everything I have experienced points to that being the issue.

    Regardless, this config now works and I am running Logstash 6.5 on macOS Mojave (10.14.1) without any issues.
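
The ignore_older assumption above can be illustrated with a small shell sketch. It is not taken from the plugin source: the helper names file_age_seconds and would_be_ignored are hypothetical, and the "strictly older than the threshold" comparison is an assumption about how the file input applies ignore_older. It only shows that, under that reading, a threshold of 0 seconds excludes any static log file, while a file that has just been written to (age 0) would still get through, which would match the “cat lines into the file to wake it up” behavior described in the question.

```shell
#!/bin/sh
# Sketch only: the "strictly older than" comparison is an assumption about
# how the file input applies ignore_older, not taken from the plugin source.

# Print a file's age in whole seconds (now minus last-modified time).
file_age_seconds() {
  now=$(date +%s)
  # GNU stat (RedHat) uses -c %Y; BSD stat (macOS) uses -f %m.
  mtime=$(stat -c %Y "$1" 2>/dev/null || stat -f %m "$1")
  echo $((now - mtime))
}

# Succeed (exit 0) when the file in $2 is older than the threshold in $1 (seconds).
would_be_ignored() {
  [ "$(file_age_seconds "$2")" -gt "$1" ]
}

# Report on the log files from the config above, if any are present.
for f in /opt/logstash/coolapp/access_log*; do
  [ -e "$f" ] || continue
  if would_be_ignored 0 "$f"; then
    echo "$f: older than 0 seconds, would be skipped"
  else
    echo "$f: would be read"
  fi
done
```

Running this against a directory of copied Apache logs should report every file as skipped, since a copied log is always at least a few seconds old. Changing the 0 to a larger threshold, for example 86400 for one day, shows which files a real ignore_older value would still let through.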