
How to publish to one single file and change access log publication frequency in AWS


I have set up an AWS load balancer to publish access logs to S3. The only problem is that it creates multiple files, instead of one file that contains all access entries for, say, a day.

Poking around, I found this documentation, which states:

Elastic Load Balancing publishes a log file for each load balancer node every 5 minutes.

This is super inconvenient, because it means that in a day I will have many tiny log files with a few entries each, instead of one log file with all the entries.

I have searched the documentation for a way to change this, and so far I have not found one.

Does anyone know how to do this? I would like to modify the publication frequency (for example, 10 minutes instead of 5), but most importantly I would like the log entries published to one single file.


Solution

  • To the best of my knowledge, there's no setting, and no super-secret undocumented API call, that will let you change the log frequency or have everything written to one file. So here are a few alternatives:

    Use Athena

    This is, imo, the best solution: Athena queries the log files in place in S3, so it doesn't matter how many of them there are, and you can write SQL queries that slice up your data for analysis. The Athena docs describe how to configure this, for ALB as well as other data sources.
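As a rough illustration of the kind of query this enables, here is a sketch that counts requests per status code for one day. It assumes a table named `alb_logs` in a database named `default`, with the `time` and `elb_status_code` columns from the table definition in the Athena docs; those names, the helper `alb_status_query`, and the results bucket are placeholders, so adjust them to whatever you actually created.

```shell
# alb_status_query DAY: hypothetical helper that prints a SQL query.
# Assumes a table "alb_logs" with "time" (ISO 8601 string) and
# "elb_status_code" columns, as in the Athena docs' ALB example.
alb_status_query() {
  day="$1"   # e.g. 2024-01-15
  printf "SELECT elb_status_code, count(*) AS requests FROM alb_logs WHERE time LIKE '%s%%' GROUP BY elb_status_code ORDER BY requests DESC" "$day"
}

# Submitting it needs AWS credentials; database and bucket are placeholders:
# aws athena start-query-execution \
#   --query-string "$(alb_status_query 2024-01-15)" \
#   --query-execution-context Database=default \
#   --result-configuration OutputLocation=s3://my-athena-results/
```

The query runs across every log file under the table's location, so the five-minute files never need to be merged.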

    Use the Command Line

    You can download the files to a temporary directory, then combine them into a single file:

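    # Create a private scratch directory for the downloaded files
    tmpdir=$(mktemp -d)
    # Replace the placeholders (bucket, prefix, account ID, region, date) with your own values
    aws s3 cp --recursive s3://bucket/prefix/AWSLogs/aws-account-id/elasticloadbalancing/region/yyyy/mm/dd/ "$tmpdir"
    # The log files are gzip-compressed; decompress and concatenate them
    zcat "$tmpdir"/* > access.log
    rm -rf "$tmpdir"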
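Because each load balancer node writes its own files, simply concatenating them does not give entries in strict time order. A minimal sketch of a merge that also sorts, assuming the ALB entry layout where the ISO 8601 timestamp is the second space-separated field (`merge_alb_logs` is a hypothetical helper name, not part of the AWS CLI):

```shell
# merge_alb_logs DIR OUT
# Decompress every .gz file in DIR and write the entries to OUT,
# ordered by the timestamp in field 2 (assumed ALB log layout).
merge_alb_logs() {
  dir="$1"
  out="$2"
  # gzip -dc is the portable spelling of zcat; LC_ALL=C gives a plain
  # byte-wise sort, which is correct for ISO 8601 timestamps.
  gzip -dc "$dir"/*.gz | LC_ALL=C sort -t ' ' -k2,2 > "$out"
}
```

Called as `merge_alb_logs "$tmpdir" access.log`, this would stand in for the plain `zcat` step when chronological order matters.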
    

    Use a Lambda

    If you definitely want actual files, then a Lambda is imo the easiest way to combine them. I would trigger it at midnight, identify all of the files for the previous day (they have a date-stamped prefix), and combine them into one large file. I'd probably write it in Java, because I think it's a little easier to stream data to S3 with Java (versus Python, which is my primary Lambda language).

    Or, you could probably Google for someone else's implementation of this.
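Whatever language the function ends up in, its first step is working out the previous day's date-stamped prefix. A rough sketch of that step in shell, assuming the key layout shown in the command-line example and that "the previous day" means yesterday in UTC; `alb_log_prefix` and its arguments are hypothetical names, and the default date relies on GNU `date`:

```shell
# alb_log_prefix BUCKET PREFIX ACCOUNT_ID REGION [DAY]
# Prints the S3 location holding one day's log files.
# DAY is YYYY-MM-DD; if omitted, defaults to yesterday in UTC (GNU date only).
alb_log_prefix() {
  day="${5:-$(date -u -d yesterday +%Y-%m-%d)}"
  # Turn YYYY-MM-DD into the yyyy/mm/dd segment used in the object keys
  path=$(printf '%s' "$day" | sed 's|-|/|g')
  printf 's3://%s/%s/AWSLogs/%s/elasticloadbalancing/%s/%s/\n' "$1" "$2" "$3" "$4" "$path"
}
```

For example, `aws s3 cp --recursive "$(alb_log_prefix bucket prefix aws-account-id region)" "$tmpdir"` would fetch yesterday's files, and a scheduled job could then combine them and upload the result.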