Tags: json, logging, aws-lambda, amazon-cloudwatch

Why are my AWS Cloudwatch logs splitting up JSON strings into two lines?


I am debugging an application that is printing single lines of JSON from AWS Lambdas. When I look at the CloudWatch logs, I am finding that "sometimes" they are being split into multiple lines.

Screenshot of Cloudwatch Logs

In this image, you can see that the JSON string is split into two lines (boxed in red). I have verified that concatenating both parts produces a valid JSON string. The first chunk is 65536 characters long and the second is 24245, for a total of 89781 characters across 2 lines.

Is there an explanation or documented behavior for this? Or is there an AWS setting that I misconfigured? I would like these logs to appear as one line.

My logging code

process.stdout.write(fastStringify(shrinkify(obj)) + "\n");

where

function shrinkify(obj: Object): Object {
  // removes key-value pairs until
  // Buffer.byteLength(JSON.stringify(obj), "utf8")
  // is < X, then returns the shrunken object
}

function fastStringify(obj: Object): string {
  try {
    return JSON.stringify(obj);
  } catch (e) {
    return require("safe-json-stringify")(obj);
  }
}
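
For reference, a minimal sketch of what shrinkify could look like. This is not the original implementation: the drop order (last key first) and the default maxBytes limit are assumptions.

```typescript
// Hypothetical sketch of shrinkify (assumption, not the original code):
// drop top-level keys until the serialized payload fits under maxBytes.
function shrinkify(
  obj: Record<string, unknown>,
  maxBytes: number = 64 * 1024
): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...obj };
  const keys = Object.keys(copy);
  while (
    keys.length > 0 &&
    Buffer.byteLength(JSON.stringify(copy), "utf8") >= maxBytes
  ) {
    // Drop the last remaining key and re-measure.
    delete copy[keys.pop() as string];
  }
  return copy;
}
```

Measuring with Buffer.byteLength rather than string length matters here, since multi-byte UTF-8 characters make the byte count larger than the character count.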

I'm using the following Lambda Runtime

Node.js 14
Architecture x86_64

Here's some of the possibly relevant dependencies from package.json file.

{
  "dependencies": {
    "aws-sdk": "2.1308.0",
    "express": "4.18.2",
    "opentracing": "0.14.7",
    "safe-json-stringify": "1.2.0",
    "serverless-http": "2.7.0"
  }
}

Observations

The funny thing is that it only happens occasionally. I have seen logs that are bigger than 65536 characters and NOT split: for example, 89775 characters across 1 line.

The other funny thing is that the split always seems to happen at some power of 2 (e.g. 2^16, 2^17, etc.), except for 2^18, where it seems to be 2^18 - 26. CloudWatch log events have a 256 KB limit, so maybe those 26 bytes are reserved for metadata?

I also found a JSON log that was absurdly huge. CloudWatch split it into 5 parts.

1st: 131072 characters
2nd: 262118 characters
3rd: 262118 characters
4th: 262118 characters
5th:  56603 characters
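
For what it's worth, the boundary arithmetic in these observations can be checked directly:

```typescript
// Quick arithmetic check of the split boundaries noted above.
console.log(2 ** 16 === 65536);        // first split in the screenshot
console.log(2 ** 17 === 131072);       // 1st chunk of the 5-part log
console.log(2 ** 18 - 26 === 262118);  // 2nd-4th chunks of the 5-part log

// The five chunks of the huge log sum back to a single original length.
const total = 131072 + 3 * 262118 + 56603;
console.log(total); // 974029
```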

Lastly, if we look back at the attached picture above, I found it interesting that the second part of the JSON string appears after the END and REPORT of the Lambda execution.

Hypothesis 1

CloudWatch states that log events can be up to 256 KB. It's not clear what happens when an event is bigger, but some articles online suggest that CloudWatch drops these events.

The CloudWatch Logs agent is a common and recommended method for exporting logs. The agent skips log events that exceed 256 KB in size. The agent also automatically adjusts the length of each batch to comply with the 1 MB restriction.

Maybe CloudWatch actually splits log events after a certain size? This would not explain why some log events get split while others do not. It also does not explain why the split happens at lengths shorter than 256 KB (e.g. 64 KB, as seen in the above screenshot).

Hypothesis 2

Maybe one of my node modules is overwriting the process.stdout.write function to account for log limits when running as a Lambda? I looked into whether serverless-http has "features" that do this, but did not see any reports of this happening.

No further progress.

Hypothesis 3

Maybe my string has a stray \r or \n inside of it? I checked the boundary of the split in the attached image. Neither character was there.

Hypothesis 4

Maybe AWS Lambda streams stdout asynchronously to CloudWatch? ...and when the Lambda exits while the stream is still in progress, it splits the log at these boundaries and flushes the remaining stream buffer after the execution context ends?

This is my most compelling hypothesis, but I couldn't find any articles to support it. It could explain why the later parts of the JSON string appear after the END and REPORT of the Lambda logs.
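
The buffering mechanism behind this hypothesis can be seen in isolation via stream backpressure: write() returns false once a chunk overflows the stream's internal buffer, meaning the remainder will be flushed asynchronously. A sketch using a PassThrough stream as a stand-in for a piped stdout (the 64 KB highWaterMark is an assumption, chosen to mirror the observed 2^16 boundary):

```typescript
import { PassThrough } from "stream";

// Stand-in for a piped stdout with a 64 KB internal buffer (assumed size).
const pipe = new PassThrough({ highWaterMark: 64 * 1024 });

const small = pipe.write("x".repeat(1024));        // fits within the buffer
const large = pipe.write("x".repeat(128 * 1024));  // overflows the buffer

console.log(small, large); // true false: the large write is queued for async flush
```

If Lambda's stdout behaves like this pipe, a write larger than the buffer would be only partially flushed when the invocation ends, with the tail arriving later.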

Workaround

I decided to implement the shrinkify() function as seen in my code snippet above. It removes keys until the length is less than X (which is configurable). I'll set this limit to 64 KB so that the logs will appear as one line.

This is unfortunate, as I will be unable to use the full 256 KB limit specified by CloudWatch, but I hope someone is able to provide an explanation, if not a solution.


Solution

  • Turns out it really had to do with the stream configuration! I read up on process.stdout in the Node.js documentation, and it mentions how different operating systems have different behavior for synchronous vs. asynchronous logging.

    I searched online for something that would force the behavior I wanted, and I eventually found the following solution.

    process.stdout._handle.setBlocking(true);
    

    A bit of warning with this solution: I hear that it's not officially supported by Node.js, so it might disappear in some future version.

    If you need a TypeScript version, as I did, follow the link below.
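
In the meantime, here is a hedged sketch of how that call can be made to type-check. Since _handle is not part of the public typings, this casts through an inline type; the optional chaining guards against runtimes where the internal API is absent. This is an illustration, not necessarily the linked answer.

```typescript
// setBlocking lives on an undocumented internal handle; cast to reach it.
// Unsupported API surface: it may change or vanish in future Node.js versions.
type BlockingHandle = { setBlocking?: (blocking: boolean) => void };

const handle = (process.stdout as unknown as { _handle?: BlockingHandle })._handle;
handle?.setBlocking?.(true);
```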