Search code examples
amazon-web-servicesaws-lambdaamazon-cloudfrontamazon-cloudwatch

Does Cloudwatch not log Lambda functions from Cloudfront?


Does Cloudfront require special settings to trigger a log?

I have the following flow:

Devices -> Cloudfront -> API Gateway -> Lambda Function

which works, but Cloudwatch doesn't seem to create logs for the lambda function (or API Gateway). However, the following flow creates logs:

Web/Curl -> API Gateway -> Lambda Function

Solution

  • In comments, above, we seem to have arrived at a conclusion that unanticipated client-side caching (or caching somewhere between the client and the AWS infrastructure) may be a more appropriate explanation for the observed behavior, since there is no known mechanism by which an independent CloudFront distribution could access a Lambda function via API Gateway and cause those requests not to be logged by Lambda.

    So, I'll answer this with a way to confirm or reject this hypothesis.

    CloudFront injects a header into both requests and responses, X-Amz-Cf-Id, containing opaque tokens that uniquely identify the request and the response. Documentation refers to these as "encrypted," but for our purposes, they're simply opaque values with a very high probability of uniqueness.

    In spite of having the same name, the request header and the response header are actually two uncorrelated values (they don't match each other on the same request/response).

    The origin-side X-Amz-Cf-Id is sent to the origin server in the request is only really useful to AWS engineers, for troubleshooting.

    But the viewer-side X-Amz-Cf-Id returned by CloudFront in the response is useful to us, because not only is it unique to each response (even responses from the CloudFront cache have different values each time you fetch the same object) but it also appears in the CloudFront access logs as x-edge-request-id (although the documentation does not appear to unambiguously state this).

    Thus, if the client side sees duplicate X-Amz-Cf-Id values across multiple responses, there is something either internal to the client or between the client and CloudFront (in the client's network or ISP) that is causing cached responses to be seen by the client.

    Correlating the X-Amz-Cf-Id from the client across multiple responses may be useful (since they should never be the same) and with the CloudFront logs may also be useful, since this confirms the timestamp of the request where CloudFront actually generated this particular response.

    tl;dr: observing the same X-Amz-Cf-Id in more than one response means caching is occurring outside the boundaries of AWS.


    Note that even though CloudFront allows min/max/default TTLs to impact how long CloudFront will cache the object, these settings don't impact any downstream or client caching behavior. The origin should return correct Cache-Control response headers (e.g. private, no-cache, no-store) to ensure correct caching behavior throughout the chain. If the origin behavior can't be changed, then Lambda@Edge origin response or viewer response triggers can be used to inject appropriate response headers -- see this example on Server Fault.

    Note also that CloudFront caches 4xx/5xx error responses for 5 minutes by default. See Amazon CloudFront Latency for an explanation and steps to disable this behavior, if desired. This feature is designed to give the origin server a break, and not bombard it with requests that are presumed to continue to fail, anyway. This behavior may cause various problems in testing as well as production, so there are cases where it should be disabled.