
AWS CloudFront latency: origin fetch vs internet


Is the first fetch of a given file from an origin via CloudFront faster on average than fetching it directly from the origin over the internet? I'm wondering if the AWS backbone somehow outperforms the public internet.

E.g., if a user in Sydney wants a file from my S3 bucket in Europe, and CloudFront doesn't yet have it cached, is it quicker to fetch it directly over the internet, or for CloudFront to fetch it from the European origin to the Sydney edge cache and then go over the internet for the last few hops? But that's just an example. Users will be worldwide, and many will be in Europe, close to the origin.

I do understand that after that request to the origin, the CDN will cache the file, and subsequent requests from Sydney for that same file within the file's TTL will be much faster; but subsequent requests will not happen often in my use case.

I have a large collection of small files (<1 MB) on S3 that seldom change; each of them is individually downloaded only rarely and will have a TTL of about 1 week.

I'm curious whether putting CloudFront in front of S3, in this case, will be worth it even though I won't get much value from the edge caching that the CDN provides.

So should I expect to see any latency decrease on average for those first fetch scenarios?

EDIT: I subsequently found this article, which mentions "Persistent Connections... reduces overall latency...", but I suspect that just means better performance of the CloudFront-to-origin subsystem, and not necessarily better end-to-end performance for the user.


Solution

  • I'm wondering if the AWS backbone somehow outperforms the speed of the public internet.

    The idea is that it should.

    You should see an overall improvement, because CloudFront does several useful things, even when not caching:

    • brings the traffic onto the AWS managed network as close to the viewer as practical, with the traffic traversing most of its distance on the AWS network rather than on the public Internet.

    • sectionalizes the TCP interactions between the browser and the origin by creating two TCP connections¹, one from browser to CloudFront and one from CloudFront to origin. The back-and-forth messaging for connection setup, TLS negotiation, and HTTP request/response is optimized, because each round trip covers a shorter distance (see the timing sketch just after this list).

    • (optional) provides an HTTP/2-to-HTTP/1.1 gateway, allowing the browser to make concurrent requests over a single HTTP/2 connection while CloudFront converts them to multiple HTTP/1.1 requests on separate connections to the origin (a quick check of this follows the timing sketch below).
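
    To see where that sectionalizing helps, here is a minimal timing sketch using pycurl; the two URLs are hypothetical placeholders for your own bucket and distribution. It breaks one HTTPS fetch into its handshake phases, so you can compare a direct-to-bucket fetch against a cold CloudFront fetch:

    ```python
    import pycurl
    from io import BytesIO

    def timing_breakdown(url):
        """Fetch url once and report where the latency goes (seconds)."""
        buf = BytesIO()
        c = pycurl.Curl()
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.WRITEDATA, buf)
        c.perform()
        stats = {
            "dns":        c.getinfo(pycurl.NAMELOOKUP_TIME),
            "tcp":        c.getinfo(pycurl.CONNECT_TIME),
            "tls":        c.getinfo(pycurl.APPCONNECT_TIME),
            "first_byte": c.getinfo(pycurl.STARTTRANSFER_TIME),
            "total":      c.getinfo(pycurl.TOTAL_TIME),
        }
        c.close()
        return stats

    # Hypothetical endpoints -- substitute your own bucket and distribution.
    for url in ("https://my-bucket.s3.eu-west-1.amazonaws.com/test.bin",
                "https://d1234example.cloudfront.net/test.bin"):
        print(url, timing_breakdown(url))
    ```

    Even on a cache miss, the viewer-facing DNS, TCP, and TLS phases complete against a nearby edge, so those numbers should shrink; "first_byte" then tells you how the full edge-to-origin round trip compares against going direct.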

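    And a quick way to confirm the viewer side of that HTTP/2 gateway, assuming httpx installed with its optional h2 extra (the distribution URL is again a placeholder):

    ```python
    import httpx  # pip install "httpx[http2]"

    # Hypothetical distribution URL.
    url = "https://d1234example.cloudfront.net/test.bin"

    with httpx.Client(http2=True) as client:
        resp = client.get(url)
        # CloudFront can speak HTTP/2 to the viewer even when the
        # origin connection is plain HTTP/1.1.
        print(resp.http_version)  # e.g. "HTTP/2"
    ```
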
    There are some minor arbitrage opportunities in the discrepancies between costs for traffic leaving a region bound for the Internet and traffic leaving a CloudFront edge bound for the Internet. (Traffic outbound from EC2/S3 to CloudFront is not billable.) In many cases these work in your favor, such as a viewer in a low-cost area accessing a bucket in a high-cost area, but they are almost always asymmetric:

    • London viewer, Sydney bucket: $0.14/GB direct to the bucket, $0.085/GB through CloudFront

    • Sydney viewer, London bucket: $0.09/GB direct to the bucket, $0.14/GB through CloudFront

    • London viewer, London bucket: $0.09/GB direct to the bucket, $0.085/GB through CloudFront

    It is my long-term assumption that these discrepancies represent the cost of Internet access compared to the cost of AWS's private transport. You can also configure CloudFront, via the price class feature, to use only the lower-cost edges. This is not guaranteed to actually use only the lower-cost edges for traffic, but rather guaranteed not to charge you a higher price if a lower-cost edge is not used.
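
    To put numbers on that asymmetry, a small sketch using the per-GB rates quoted above (AWS pricing changes over time, so treat these as illustrative):

    ```python
    # Data-transfer-out rates ($/GB) copied from the examples above.
    RATES = {
        ("London", "Sydney"): {"direct": 0.14, "cloudfront": 0.085},
        ("Sydney", "London"): {"direct": 0.09, "cloudfront": 0.14},
        ("London", "London"): {"direct": 0.09, "cloudfront": 0.085},
    }

    gb_per_month = 500  # hypothetical monthly egress

    for (viewer, bucket), r in RATES.items():
        savings = (r["direct"] - r["cloudfront"]) * gb_per_month
        # Negative savings means CloudFront costs more on this path.
        print(f"{viewer} viewer, {bucket} bucket: "
              f"CloudFront saves ${savings:+.2f} on {gb_per_month} GB/month")
    ```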

    Note also that there are two (known) services that use CloudFront with caching always disabled:

    Enabling S3 Transfer Acceleration on a bucket is fundamentally a zero-config-required CloudFront distribution without the cache enabled. Transfer Acceleration has only three notable differences compared to a self-provisioned CloudFront + S3 arrangement:

    • it can pass through signed URLs that S3 understands and accepts (with S3 plus your own CloudFront, you have to use CloudFront signed URLs, which use a different algorithm);

    • the CloudFront network is bypassed for users who are geographically close to the bucket region, which also eliminates the Transfer Acceleration surcharge for those requests; and

    • it almost always costs more than your own CloudFront + S3.

    AWS apparently believes the value added here is significant enough to justify the feature costing more than S3 + CloudFront provisioned yourself. On occasion, I have used it to squeeze a bit more optimization out of a direct-to-bucket arrangement, because it is an easy change to make.
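
    That "easy change" looks roughly like this with boto3 (bucket and key names are hypothetical):

    ```python
    import boto3
    from botocore.config import Config

    BUCKET = "my-example-bucket"  # hypothetical

    # One-time change: enable Transfer Acceleration on the bucket.
    boto3.client("s3").put_bucket_accelerate_configuration(
        Bucket=BUCKET,
        AccelerateConfiguration={"Status": "Enabled"},
    )

    # A client that routes requests through the accelerate endpoint
    # (bucket.s3-accelerate.amazonaws.com) instead of the regional one.
    s3_accel = boto3.client(
        "s3", config=Config(s3={"use_accelerate_endpoint": True})
    )

    # Presigned URLs generated here use S3's own signing algorithm,
    # unlike CloudFront signed URLs.
    url = s3_accel.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": "some/object"},  # hypothetical key
        ExpiresIn=3600,
    )
    print(url)
    ```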

    Find the Transfer Acceleration speed test on this page and observe what it does. It tests upload rather than download, but it is the same idea -- it gives you a reasonable depiction of the differences between the public Internet and the AWS "Edge Network" (the CloudFront infrastructure).

    API Gateway edge-optimized APIs also route through CloudFront for performance reasons. While API Gateway does offer optional caching, it uses a caching instance, not the CloudFront cache. API Gateway subsequently introduced a second type of API endpoint that doesn't use CloudFront, because when you are making requests from within the same AWS region, it doesn't make sense to send the request through extra hardware. This also makes deploying API Gateway behind your own CloudFront distribution a bit more sensible, avoiding an unnecessary second pass through the same infrastructure.
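
    For completeness, selecting the endpoint type at creation time with boto3 looks like this (the API name is hypothetical):

    ```python
    import boto3

    apigw = boto3.client("apigateway")

    # REGIONAL skips the CloudFront pass entirely -- appropriate for
    # same-region callers, or when you front the API with your own
    # CloudFront distribution. EDGE is the edge-optimized type.
    api = apigw.create_rest_api(
        name="example-api",  # hypothetical
        endpointConfiguration={"types": ["REGIONAL"]},
    )
    print(api["id"])
    ```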


    ¹Two TCP connections may actually be three, which should tend to further improve performance, because the boundary between each connection provides a content buffer that allows for smoother and faster transport and changes the bandwidth-delay product in favorable ways. Since sometime in 2016, CloudFront has had two tiers of edge locations: the outer "global" edges (closest to the viewer) and the inner "regional" edges (within the actual AWS regions). This is documented, but the documentation is very high-level and doesn't explain the underpinnings thoroughly. Anecdotal observations suggest that each global edge has an assigned "home" regional edge -- the regional edge in its nearest AWS region. The connection goes from viewer, to outer edge, to inner edge, and then to origin. The documentation suggests that there are cases where the inner (regional) edge is bypassed, but observations suggest that these are the exception.