amazon-web-services cdn amazon-cloudfront

Can a distribution automatically match the subdomain of a request to determine the origin?


We're adding a lot of nearly equivalent apps on the same domain. Each app can be accessed through its own subdomain and has a small number of app-specific assets.

Every app refers to the same cdn.mydomain.com to get its assets from CloudFront.

Assets are namespaced. For example:

app1:

  • Can be reached from app1.mydomain.com
  • Assets URL is cdn.mydomain.com/assets/app1
  • CloudFront origin is app1.mydomain.com
  • Cache behavior /assets/app1/* points to origin app1.mydomain.com

When CloudFront doesn't have the assets in cache, it downloads them from the right origin.

Currently, we create a new origin and cache behavior on the same distribution each time we add a new app.

We'd like to simplify that process so that CloudFront can fetch the assets from the right origin without us having to configure each one explicitly. This would also keep us from hitting the limit on the number of origins in one distribution.

Is this possible, and if so, how can we do it?

We're thinking of creating a single origin of mydomain.com with a cache behavior configured to forward the Host header, but we're not sure that this will do the trick.


Solution

  • Origins are tied to Cache Behaviors, which are tied to path patterns. You can't really do what you're thinking about doing.

    I would suggest creating a distribution for each app and each subdomain. It's very easy to script this using aws-cli: once you have one set up the way you like it, you can use its configuration output as a template to make more, with minimal changes. (I use a Perl script to build the final JSON to create each distribution, with minimal inputs like alternate domain name and certificate ARN, and pipe its output into aws-cli.)
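As a rough sketch of that templating step (in Python rather than Perl), the function below copies an existing distribution config and swaps in the per-app values. It assumes the template is the `DistributionConfig` JSON of a working distribution with a single origin. The function name, the example hostnames, and the chosen `MinimumProtocolVersion` are illustrative assumptions, not part of the original setup.

```python
import copy
import uuid


def derive_distribution_config(template, alias, certificate_arn, origin_domain):
    """Return a new DistributionConfig dict derived from an existing one.

    `template` is assumed to be the DistributionConfig of a distribution
    that is already set up the way you like it, with exactly one origin.
    """
    config = copy.deepcopy(template)  # never mutate the template itself

    # CallerReference must be unique for every create request.
    config["CallerReference"] = f"{alias}-{uuid.uuid4()}"

    # One alternate domain name (CNAME) per app distribution.
    config["Aliases"] = {"Quantity": 1, "Items": [alias]}

    # Attach the ACM certificate; the protocol version is an assumed choice.
    config["ViewerCertificate"] = {
        "ACMCertificateArn": certificate_arn,
        "SSLSupportMethod": "sni-only",
        "MinimumProtocolVersion": "TLSv1.2_2021",
    }

    # Point the single origin at this app's own hostname.
    config["Origins"]["Items"][0]["DomainName"] = origin_domain
    config["Comment"] = f"Assets for {alias}"
    return config
```

One way to use it: save the template with `aws cloudfront get-distribution-config --id <distribution-id> --query DistributionConfig`, write the derived dict out with `json.dump`, and pass that file to `aws cloudfront create-distribution --distribution-config file://...`.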

    I believe this is the right approach, because:

    • CloudFront cannot select the origin based on the Host header. Only the path pattern is used to select the origin.
    • Lambda@Edge can rewrite the path and can inspect the Host header, but it cannot rewrite the path before the matching is done that selects the Cache Behavior (and thus the origin). You cannot use Lambda@Edge to cause CloudFront to switch or select origins, unless you generate browser redirects, which you probably don't want to do, for performance reasons. I've submitted a feature request to allow a Lambda trigger to signal CloudFront that it should return to the beginning of processing and re-evaluate the path, but I don't know if it is being considered as a future feature -- AWS tends to keep their plans for future functionality close to the vest, and understandably so.
    • you don't gain any efficiency or cost savings by combining your sites in a single distribution, since the resources are different
    • if you decide to whitelist the Host header, that means CloudFront will cache responses, separately, based on the Host header, the same as it would do if you had created multiple distributions. Even if the path is identical, it will still cache separate responses if the Host header differs, as it must to ensure sensible behavior
    • the default limit for distributions is 200, while the limit for origins and cache behaviors is 25. Both can be raised by request, but the number of distributions AWS can give you is unlimited, while the other resources are finite because they increase the workload on the system for each request and would eventually have a negative performance impact
    • separate distributions give you separate logs and reports
    • provisioning errors have a smaller blast radius when each app has its own distribution
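To make the first and fourth points concrete, here is a toy model of the selection logic described above. It is not CloudFront's implementation, just a sketch under the stated assumptions. The behavior list mirrors the question's setup, `app2` and the default origin `mydomain.com` are hypothetical, and `fnmatchcase` only approximates CloudFront path patterns (it also gives `[` a special meaning).

```python
from fnmatch import fnmatchcase

# Ordered cache behaviors: (path pattern, origin). First match wins.
BEHAVIORS = [
    ("/assets/app1/*", "app1.mydomain.com"),
    ("/assets/app2/*", "app2.mydomain.com"),
]
DEFAULT_ORIGIN = "mydomain.com"  # hypothetical default (*) behavior


def select_origin(path, host):
    """Pick the origin from the path alone; `host` is deliberately ignored."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):  # case-sensitive, like path patterns
            return origin
    return DEFAULT_ORIGIN


def cache_key(path, host, forward_host):
    """With the Host header whitelisted, it becomes part of the cache key."""
    return (host, path) if forward_host else (path,)
```

In this model a request for /assets/app1/logo.png resolves to app1.mydomain.com whatever Host it arrives with, while whitelisting Host makes two requests for the same path under different hostnames produce two distinct cache entries.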

    You can also go into AWS Certificate Manager and request a wildcard certificate for *.cdn.example.com. Then use e.g. app1.cdn.example.com as the alternate domain name for the app1 distribution and attach the wildcard cert. Then reuse the same cert on the app2.cdn.example.com distribution, etc.
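A wildcard certificate covers exactly one extra label to the left of the base name, which is worth keeping in mind when choosing per-app hostnames. The helper below is a minimal sketch of that rule. The function name is made up for illustration, and it does not call ACM at all. (Certificates used with CloudFront need to be requested in the us-east-1 region.)

```python
def covered_by_wildcard(hostname, certificate_name):
    """Return True if `hostname` is covered by `certificate_name`.

    A name like "*.cdn.example.com" matches exactly one label in place of
    the "*", so it covers neither the bare base name nor deeper subdomains.
    """
    host = hostname.lower()
    name = certificate_name.lower()
    if not name.startswith("*."):
        return host == name  # plain name: exact match only
    suffix = name[1:]  # e.g. ".cdn.example.com"
    if not host.endswith(suffix):
        return False
    label = host[: -len(suffix)]  # the part that stands in for "*"
    return label != "" and "." not in label
```

For example, app1.cdn.example.com and app2.cdn.example.com are both covered by *.cdn.example.com, but cdn.example.com itself and a.b.cdn.example.com are not.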

    Note that you also have an easy migration strategy from your current solution: You can create a single distribution with *.cdn.example.com as its alternate domain name. Code the apps to use their own unique-name-here.cdn.example.com, and point all of those DNS records at the wildcard distribution. Later, when you create a distribution with a specific alternate domain name foo.cdn.example.com, CloudFront will automatically stop routing those requests to the wildcard distribution and start routing them to the one with the specific domain. You will still need to change the DNS entry, but CloudFront will handle the requests correctly, routing them to the newly-created distribution, even before you change the DNS, because it has some internal magic that matches the non-wildcard hostname to the correct distribution regardless of whether the browser connects to the new endpoint or the old. So the migration event should pretty much be a non-event.
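The routing rule this migration relies on, where a specific alternate domain name takes precedence over a wildcard, can be modelled in a few lines. This is only a sketch of the behaviour described above, not CloudFront's internals, and the distribution labels are hypothetical.

```python
def route_to_distribution(hostname, aliases):
    """Return the distribution for `hostname`, preferring an exact alias.

    `aliases` maps alternate domain names (exact or "*.parent") to a
    distribution label. Returns None when nothing matches.
    """
    host = hostname.lower()
    if host in aliases:
        return aliases[host]  # a specific alias always wins
    # Otherwise fall back to a wildcard replacing the first label.
    _, _, parent = host.partition(".")
    return aliases.get(f"*.{parent}")


# Before migration: everything lands on the wildcard distribution.
aliases = {"*.cdn.example.com": "wildcard-distribution"}
before = route_to_distribution("foo.cdn.example.com", aliases)

# After creating a dedicated distribution for foo, only foo moves over.
aliases["foo.cdn.example.com"] = "foo-distribution"
after = route_to_distribution("foo.cdn.example.com", aliases)
```

Other hostnames such as bar.cdn.example.com keep resolving to the wildcard distribution until they get a dedicated one of their own.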

    I'd suggest the wildcard strategy is a good one, anyway, so that your apps are each connecting to a specific endpoint hostname, allowing you much more flexibility in the future.