We have a Python application that generates an hourly rotating log, with rotation set to the start of each hour, i.e. the log rotates at 10:00, 11:00, 12:00, and so on. The application is deployed in a Kubernetes pod, and FluentD runs as a sidecar container to upload these log files to an S3 bucket under the path
s3bucket/<id>/logs/%Y-%m-%d/%H/metrics
The goal is to create a separate folder for each hour of the day and upload that hour's logs into it. In FluentD we have tried upload intervals of 60s and 30s, but each time the hour changes (e.g. at 10:00), FluentD uploads the newly created log file for the new hour (which is blank or holds only a few 10:00 entries) into the previous hour's folder (9:00) in the S3 bucket, overwriting the previous hour's logs (in our case, the logs from 9:00 to 9:59:59).
We have tried setting timekey to 60s and 30s, delaying/increasing the log rotation time, and adjusting other settings (rotate_wait, refresh_interval) so that uploads keep going to the proper folder until the start of the next hour, but delaying leads to logs being overwritten and increasing the time leads to lost logs.
Logs for fluentd:
2022-04-06 10:59:00 +0000 [warn]: #1 def7zme94qc7q9folg5zly641/endpoints/p5wspkd85sr61/mep/2022-04-06/10/meplogs/logs_10.gz already exists, but will overwrite
2022-04-06 10:59:31 +0000 [warn]: #1 def7zme94qc7q9folg5zly641/endpoints/p5wspkd85sr61/mep/2022-04-06/10/meplogs/logs_10.gz already exists, but will overwrite
2022-04-06 10:59:57 +0000 [info]: #1 detected rotation of /var/log/cluster-env/gateway.log; waiting 120.0 seconds
2022-04-06 10:59:57 +0000 [info]: #0 detected rotation of /var/log/cluster-env/gateway.log; waiting 120.0 seconds
2022-04-06 10:59:57 +0000 [info]: #1 following tail of /var/log/cluster-env/gateway.log
2022-04-06 10:59:57 +0000 [info]: #0 following tail of /var/log/cluster-env/gateway.log
2022-04-06 10:59:57 +0000 [info]: #1 following tail of /var/log/cluster-env/gateway.log.2022-04-06_09
2022-04-06 11:00:01 +0000 [warn]: #1 def7zme94qc7q9folg5zly641/endpoints/p5wspkd85sr61/mep/2022-04-06/10/meplogs/logs_10.gz already exists, but will overwrite
So even at 11:00:01, logs are still uploaded to the folder for hour 10.
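These log lines look consistent with how time-keyed buffering is documented to behave: an event goes into the chunk whose key is its time rounded down to a multiple of timekey, and %H in path is filled from that chunk key rather than from the wall clock at upload time. The following is a minimal Python sketch of that arithmetic, not FluentD code. The function names chunk_start and object_key are made up for illustration, and the path prefix is shortened to "mep/...".

```python
from datetime import datetime, timezone

TIMEKEY = 30  # seconds, as in the <buffer time> section of the config


def chunk_start(event_time: datetime, timekey: int = TIMEKEY) -> datetime:
    """Round an event time down to the start of its timekey window."""
    epoch = int(event_time.timestamp())
    return datetime.fromtimestamp(epoch - epoch % timekey, tz=timezone.utc)


def object_key(chunk_time: datetime) -> str:
    """Expand path and "%{path}/logs_%H.gz" using the chunk's time key."""
    prefix = chunk_time.strftime("mep/%Y-%m-%d/%H/meplogs")
    return f"{prefix}/logs_{chunk_time:%H}.gz"


events = [
    datetime(2022, 4, 6, 10, 59, 31, tzinfo=timezone.utc),
    datetime(2022, 4, 6, 10, 59, 59, tzinfo=timezone.utc),
    datetime(2022, 4, 6, 11, 0, 1, tzinfo=timezone.utc),
]

for t in events:
    # 10:59:31 and 10:59:59 share the chunk starting at 10:59:30 (hour 10);
    # 11:00:01 belongs to the chunk starting at 11:00:00 (hour 11).
    print(t.time(), "->", object_key(chunk_start(t)))
```

Under this model, the chunk covering 10:59:30 to 11:00:00 can only be flushed once its window has closed, so an upload logged at 11:00:01 into the 10 folder would be that last chunk of hour 10 rather than data from hour 11. It also shows that every chunk within an hour expands to the same key (logs_10.gz), so with overwrite true each upload replaces the previous one.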
FluentD config for logs
<worker 1>
<source>
tag "gateway-s3-logs"
@label @gateway-s3-logs
@type tail
path "/var/log/cluster-env/gateway.log"
pos_file "/var/log/cluster-env/gateway.log-s3-container-log-in-tail.pos"
read_from_head true
follow_inodes true
refresh_interval 5
rotate_wait 120
<parse>
@type "none"
unmatched_lines
</parse>
</source>
<label @gateway-s3-logs>
<match gateway-s3-logs>
@type s3
s3_bucket "sranjha-log-test"
s3_region "us-west-2"
path "def7zme94qc7q9folg5zly641/endpoints/p5wspkd85sr61/mep/%Y-%m-%d/%H/meplogs"
s3_object_key_format "%{path}/logs_%H.gz"
check_apikey_on_start false
overwrite true
utc
<buffer time>
@type "file"
path "/tmp/fluentd/mep-logs/logs/out-s3-buffer*"
chunk_limit_size 64MB
flush_at_shutdown true
timekey 30
timekey_wait 0
retry_timeout 30s
retry_type exponential_backoff
retry_exponential_backoff_base 2
retry_wait 1s
retry_randomize true
disable_chunk_backup true
retry_max_times 5
</buffer>
<local_file_upload>
file_path "/var/log/emr-on-cluster-env/gateway.log"
</local_file_upload>
<secondary>
@type "secondary_file"
directory "/var/log/fluentd/error/"
basename "s3-mep-error.log"
</secondary>
<format>
utc
localtime false
</format>
<inject>
localtime false
</inject>
</match>
</label>
</worker>
So, is there a way to make FluentD write logs up to hh:59:59 into the previous hour (hh) folder, and logs from hh+1:00:00 onward into the new hour (hh+1) folder?
Here is our config, which sends logs to S3 every minute.
type s3
<template>
s3_bucket "mybucket"
s3_region "my_region"
path my_path/
s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
time_slice_format ${tag}/YEAR=%Y/MONTH=%m/DAY=%d/HOSTNAME=${hostname}/HOUR=%H/%M
<format>
@type json
</format>
store_as gzip
<buffer time>
timekey 30
@type file
path /var/log/td-agent/buffer/s3/${tag}
timekey_wait 1m
chunk_limit_size 50m
flush_at_shutdown true
</buffer>
</template>
The last log file we get for an hour is named like 59_1.gz, and the last message inside that gzip is timestamped "2022-04-07T18:59:44.777+0300".
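To illustrate why a key format with the minute and %{index} avoids overwrites, here is a rough Python model of the naming. It assumes the S3 plugin picks the first index whose key does not exist yet, which is how %{index} is described in the plugin documentation. The function next_key and the set standing in for the bucket are hypothetical, the ${tag}, ${hostname} and date parts of time_slice_format are left out, and time zones are ignored.

```python
from datetime import datetime


def next_key(existing: set, chunk_time: datetime) -> str:
    """Build "<time_slice>_<index>.gz", bumping index until the key is unused."""
    time_slice = chunk_time.strftime("HOUR=%H/%M")  # shortened time_slice_format
    index = 0
    while f"{time_slice}_{index}.gz" in existing:
        index += 1
    key = f"{time_slice}_{index}.gz"
    existing.add(key)
    return key


bucket = set()  # stands in for the object keys already present in S3

# With timekey 30 there are two chunks per minute: one starting at :00, one at :30.
first = next_key(bucket, datetime(2022, 4, 7, 18, 59, 0))
second = next_key(bucket, datetime(2022, 4, 7, 18, 59, 30))

print(first)   # HOUR=18/59_0.gz
print(second)  # HOUR=18/59_1.gz
```

In this model the second chunk of minute 59 (18:59:30 to 19:00:00) becomes 59_1.gz, which matches a last message at 18:59:44 ending up in that file while staying inside the folder for hour 18.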