I am trying to process gzipped log files (.gz extension) in fluentd with the cat_sweep plugin, without success. As the config below shows, I want to process every file under /opt/logfiles/. When a file is in .gz format, cat_sweep cannot process it and goes on to delete it. If I unzip the file manually inside /opt/logfiles/, cat_sweep processes it without problems.
<source>
@type cat_sweep
file_path_with_glob /opt/logfiles/*
format none
tag raw.log
waiting_seconds 0
remove_after_processing true
processing_file_suffix .processing
error_file_suffix .error
run_interval 5
</source>
So now I need a plugin that can unzip a given file. I searched for one and came close with the in_exec plugin, which runs a command much like a terminal would, so I could use something like gzip -d file_path
Link to the plugin:
http://docs.fluentd.org/v0.12/articles/in_exec
But the problem I see here is that I cannot pass the path of the file to be unzipped at run time.
Can someone help me with some pointers?
Given your requirement, you can still achieve this with the in_exec plugin. Create a shell script that accepts two arguments: the folder path to scan for .gz files, and a wildcard pattern to match file names. Inside the script, unzip the files in that folder that match the pattern. The invocation should look like:
sh unzip.sh <folder_path_to_monitor> <wildcard_to_files>
Then use that command in the in_exec source in your config, which will look like:
<source>
@type exec
format json
tag unzip.sh
command sh unzip.sh <folder_path_to_monitor> <wildcard_to_files>
run_interval 10s
</source>
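For reference, here is a minimal sketch of what unzip.sh could look like. The function name unzip_matching and the "unzipped" JSON key are my own choices for illustration, not anything fluentd requires. It prints one JSON line per decompressed file, since the config above uses format json. It assumes file paths contain no double quotes or backslashes (otherwise the JSON would be invalid) and that your find supports -maxdepth (GNU and BSD find do).

```shell
#!/bin/sh
# unzip.sh (hypothetical sketch): decompress .gz files in a folder that
# match a wildcard, and report each one as a JSON line on stdout.
# Usage: sh unzip.sh <folder_path_to_monitor> <wildcard_to_files>

unzip_matching() {
    dir=$1
    pattern=$2
    # Only look at regular files directly inside $dir that match the
    # wildcard and end in .gz.
    find "$dir" -maxdepth 1 -type f -name "$pattern" -name '*.gz' |
    while IFS= read -r f; do
        # gzip -d replaces file.gz with file; report only on success.
        if gzip -d -- "$f"; then
            printf '{"unzipped":"%s"}\n' "${f%.gz}"
        fi
    done
}

# Run only when both arguments are supplied.
if [ "$#" -ge 2 ]; then
    unzip_matching "$1" "$2"
fi
```

Quote the wildcard when you call it, for example sh unzip.sh /opt/logfiles '*.gz', so that the shell hands the pattern to find unexpanded. Also keep in mind that your cat_sweep glob /opt/logfiles/* matches the .gz files as well, so it may still pick them up before this script runs. Dropping the compressed files into a separate folder, or narrowing the cat_sweep glob, would probably be safer.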