I have a folder with log files from 2016 to the present and have set up Filebeat with "ignore_older: 48h". All the files get rotated so that "log" is always the newest one, "log.1" is the next, and so on. The logs are on a Linux NFS partition mounted on the Logstash host.
I expect Filebeat to pick up only log files that were changed in the last 48h and ignore the older ones.
That is what happens, except that from time to time it also picks up older files, in no particular order.
I ran the "stat" command on one of the older files from 2018 and I see the following:
Access: 2019-03-02 03:15:32.254460960 +0000
Modify: 2018-09-06 13:12:00.331460890 +0000
Change: 2019-02-28 03:34:33.946462475 +0000
I run Filebeat version 6.4.2.
Is this data confusing Logstash? What is it actually looking at when checking whether a file has changed? How can I stop it from picking up older files?
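One way to narrow this down is to check which of the three timestamps from "stat" actually falls inside the 48h window. The following is only a sketch: the function names are made up for illustration, and it assumes (as the Filebeat documentation describes for ignore_older) that the modification time is the one that matters.

```python
import os
from datetime import datetime, timedelta, timezone


def file_stamps(path):
    """Read access/modify/change times of a file as UTC datetimes."""
    st = os.stat(path)
    raw = {"access": st.st_atime, "modify": st.st_mtime, "change": st.st_ctime}
    return {name: datetime.fromtimestamp(ts, tz=timezone.utc) for name, ts in raw.items()}


def inside_window(stamps, now, window):
    """For each timestamp, report whether it is no older than `window`."""
    return {name: (now - ts) <= window for name, ts in stamps.items()}


def report(path, window=timedelta(hours=48)):
    """Hypothetical helper: check one file against the ignore_older window."""
    now = datetime.now(timezone.utc)
    return inside_window(file_stamps(path), now, window)
```

Calling report() on one of the 2018 files shows at a glance whether only the access or change time is recent while the modify time is not, as in the "stat" output above.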
UPDATE:
My filebeat configuration looks like this:
- type: log
  enabled: true
  paths:
    - /path/to/my/log/file/log*
  fields:
    logname: "log.name"
  include_lines: ["SOME_TEXT"]
  ignore_older: 48h
Logs are in CSV format.
On another host I do the same, but with Logstash directly. The input config looks like this:
input {
  file {
    path => "/path/to/my/log/file/log*"
    mode => "tail"
    start_position => "beginning"
    close_older => "24h"
    ignore_older => "2w"
  }
}
I have the same issue here.
You can try two things. The first is to remove the * after log in the path, like this:
- /path/to/my/log/file/log
This should be enough, because Filebeat keeps reading a rotated log file even after it has been moved, until the file reaches a certain age.
Alternatively, for Logstash the path parameter is an array, so if you know how often the files get rotated you can list the files to be read explicitly:
path => [ "/path/to/my/log/file.log", "/path/to/my/log/file1.log", "/path/to/my/log/file2.log" ]
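If the list gets long, it can be generated rather than typed by hand. This is a minimal sketch, assuming the rotation scheme from the question ("log", "log.1", "log.2", ...) and a rotation count you choose yourself. Both function names are hypothetical and are not part of Logstash.

```python
def rotated_paths(base, count):
    """Return the `count` newest files: base, base.1, ..., base.(count-1)."""
    return [base] + [f"{base}.{i}" for i in range(1, count)]


def logstash_path_setting(paths):
    """Format a list of paths as a Logstash `path => [ ... ]` line."""
    quoted = ", ".join(f'"{p}"' for p in paths)
    return f"path => [ {quoted} ]"


if __name__ == "__main__":
    # Assumption: only the current file and the two previous rotations are wanted.
    print(logstash_path_setting(rotated_paths("/path/to/my/log/file/log", 3)))
```

The printed line can be pasted into the file input block in place of the wildcard path.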