Goal: analyze a large set of emails stored in files. I used the offlineimap tool to download the emails to local files.
I am somewhat familiar with ELK, but I am not sure how to configure Logstash so that it stores exactly one event per file.
I have not tried the multiline plugin yet because I do not have a complete set of rules for how the files start and end. I just want to parse all the files and store one event per file, regardless of how each one starts or ends.
NOTE: I could not use the Logstash imap plugin because it fetches and stores only new emails; it does not load all mail from the server.
A similar question for a different use case, "Logstash Multiline filter", has unfortunately gone unanswered for more than a couple of years.
A solution was suggested in the comments on "Logstash Multiline filter", and it worked. Basically, I had to append a marker string to the end of every file and then use the multiline plugin.
I created a shell script to append the extra line to every file:
    for file in **/**/*; do
        # the glob also matches directories; only append to regular files
        [ -f "$file" ] || continue
        echo 'ENDOFMAILFILE' >> "$file"
    done
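One caveat with the loop above: if a mail file does not end with a newline, the marker is glued onto the file's last line and `^ENDOFMAILFILE$` will not match it. Running the loop twice also appends the marker twice. A slightly more defensive variant could look like the sketch below. The function names `append_marker` and `append_marker_tree` are hypothetical, and it assumes file names contain no newlines (which should hold for Maildir files written by offlineimap).

```shell
# Marker string, same as in the loop above.
MARKER='ENDOFMAILFILE'

# Append MARKER as its own line to one file, unless it is already the last line.
# (hypothetical helper, not part of Logstash or offlineimap)
append_marker() {
    file=$1
    # Ignore directories and other non-regular files.
    [ -f "$file" ] || return 0
    # Already marked: do nothing, so the script is safe to re-run.
    if [ "$(tail -n 1 "$file")" = "$MARKER" ]; then
        return 0
    fi
    # Non-empty file whose last byte is not a newline: terminate the line first.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
    printf '%s\n' "$MARKER" >> "$file"
}

# Apply append_marker to every regular file below a directory.
# Assumes no newlines in file names.
append_marker_tree() {
    find "$1" -type f | while IFS= read -r f; do
        append_marker "$f"
    done
}
```

It would be called as `append_marker_tree /var/log/mail`, using the same directory as the Logstash `path` setting.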
After that, I used the multiline codec in the Logstash file input:
    input {
      file {
        type => "logmail"
        path => [ "/var/log/mail/**/*" ]
        start_position => "beginning"
        codec => multiline {
          pattern => "^ENDOFMAILFILE$"
          negate => "true"
          what => "previous"
        }
      }
    }