logging amazon-cloudwatch

How to download complete AWS CloudWatch log


QUESTION SUMMARY

How to download a complete log from CloudWatch using CLI tools?

The log that I download is incomplete. I know this because if I reverse the order, using --start-from-head, I get new content, not just the same content in reverse order.


RESEARCH

I am trying to trace a tricky intermittent failure in a (Flask/Zappa, AWS lambda) microservice.

I need to download the logs.

I can inspect the logs in the CloudWatch console, where one of the streams contains the text I'm after (screenshot not included).

However, if I download this log, the downloaded file does not contain this text:

> aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]59bc7e539d7948688e0666f8ed14822a' > wtf.txt

> cat wtf.txt | grep "timer"

i.e. nothing is found.

If I add --start-from-head, the text does appear:

> aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]59bc7e539d7948688e0666f8ed14822a' --start-from-head  > wtf.txt

> cat wtf.txt | grep "timer"
        "message": "> > >  starting game timer  < < <\n",

From https://docs.aws.amazon.com/cli/latest/reference/logs/get-log-events.html I observe:

--limit (integer)

The maximum number of log events returned. If you don't specify a value, the maximum is as many log events as can fit in a response size of 1 MB, up to 10,000 log events.

... and:

> ls -l wtf.txt
-rw-r--r--  1 pi  staff  1247053  3 Dec 10:55:14 2018 wtf.txt

So the response is over 1 MB. It appears that the log is too long to fit in a single response, and the text I'm after is in the earliest part of the log.

So the question becomes: How to download the complete log?

I tried setting a higher --limit, but got:

An error occurred (InvalidParameterException) when calling the GetLogEvents operation: 1 validation error detected: Value '999999' at 'limit' failed to satisfy constraint: Member must have value less than or equal to 10000

And 10000 is the default! Setting an arbitrary limit is ugly anyway: whatever I set, there is a risk that the log will be longer.

How about using the documented "nextForwardToken" key?

import json
from os import system

def get_complete_log(stream_name):
    nextForwardToken = None

    while True:
        param_group =  " --log-group-name '/aws/lambda/api-dev'"
        param_stream = " --log-stream-name '" + stream_name + "'"
        param_token = (" --next-token '" + nextForwardToken + "'") if nextForwardToken else ""

        params = param_group + param_stream + param_token

        cmd = "aws logs get-log-events" + params + " > logs/tmp.txt"
        print(cmd)
        system(cmd)      

        with open('logs/tmp.txt','r') as f:
            tmp = f.read()

            print('CONTENTS:', tmp[:120], '\n')

            J = json.loads( tmp )

        nextForwardToken = J.get("nextForwardToken")

        if not nextForwardToken:
            break


get_complete_log( "2018/12/01/[$LATEST]59bc7e539d7948688e0666f8ed14822a" )

And if I inspect the output:

aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]030c7bd5c6ff4d9eb3bb56b8607746b8' > logs/tmp.txt
CONTENTS: {
    "events": [
        {
            "timestamp": 1543707627572,
            "message": "START RequestId: 7b34fa3b-f5 

aws logs get-log-events --log-group-name '/aws/lambda/api-dev' --log-stream-name '2018/12/01/[$LATEST]030c7bd5c6ff4d9eb3bb56b8607746b8' --next-token 'f/34426362085021867195594556764906427633106607331166978053' > logs/tmp.txt
CONTENTS: {
    "events": [],
    "nextForwardToken": "f/34426362085021867195594556764906427633106607331166978053",
    "nextBackw 

So every call except the first returns "events": [], and "nextForwardToken" is the same token that was passed in!


Solution

  • I would recommend trying out the awslogs CLI tool. In my opinion, it's much more reliable than the AWS console and the AWS CLI tool. I've used it to search through massive log streams in CloudWatch. You can easily specify a time range to search over, or even grep over CloudWatch log streams. You can also watch a log stream in real time. The example below searches all the log streams in a group for the specified time range, greps for the pattern ERROR, and uses tee to write the matches to both a file and the console:

    awslogs get my_log_group ALL --start='23/1/2015 12:00' --end='23/1/2016 13:00' | grep ERROR | tee errlogs.txt
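
If you would rather stay with the plain AWS CLI, here is a minimal sketch of paging through a single stream. It relies on two points from the GetLogEvents documentation as I understand them: a nextForwardToken has to be used together with --start-from-head, and the end of the stream is signalled by the call returning the same token that was passed in (which matches the output shown in the question). The names fetch_page and get_complete_log, and the fetch parameter, are my own and not part of any AWS API.

```python
import json
import subprocess


def fetch_page(group, stream, token=None):
    """Fetch one page of events via the AWS CLI (assumes `aws` is installed and configured)."""
    cmd = [
        "aws", "logs", "get-log-events",
        "--log-group-name", group,
        "--log-stream-name", stream,
        "--start-from-head",   # read oldest events first so forward tokens advance
        "--output", "json",
    ]
    if token:
        cmd += ["--next-token", token]
    # Passing a list (no shell) means '[$LATEST]' in the stream name needs no quoting.
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


def get_complete_log(group, stream, fetch=fetch_page):
    """Collect every event in a stream by following nextForwardToken.

    `fetch` is a hypothetical hook so the loop can be exercised without AWS.
    """
    events = []
    token = None
    while True:
        page = fetch(group, stream, token)
        events.extend(page.get("events", []))
        next_token = page.get("nextForwardToken")
        # An unchanged token means the end of the stream has been reached.
        # An empty page alone is not treated as the end.
        if not next_token or next_token == token:
            break
        token = next_token
    return events
```

Called as get_complete_log('/aws/lambda/api-dev', '2018/12/01/[$LATEST]59bc7e539d7948688e0666f8ed14822a'), this should return a list of event dicts that you can dump to a file with json.dump or filter for "timer" directly. The loop stops on a repeated token rather than on an empty page, on the assumption that an intermediate page may be empty while the token still changes.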