I'm trying to implement a "proxy" to multiple websites using Lambda@Edge on AWS CloudFront.
My setup is roughly:
DNS: *.domain.com -> some_uuid.cloudfront.net (CloudFront distribution)
CloudFront: some_uuid.cloudfront.net -> S3 bucket origin
S3 bucket: websites/ (a folder that contains multiple websites)
Lambda@Edge function: defined as origin-request
My Lambda@Edge function is quite simple:
Check if the website resource exists in the S3 bucket.
If it does, change the request URI to the resource's S3 URL.
If not, send a request to a backend server to render the resource, store it on S3 and return it.
I'm having trouble getting the domain of the website that the viewer originally requested. For example, if I try to access "my_website.domain.com", my Lambda function doesn't get this domain from the request.
I think I can implement another Lambda@Edge function as a viewer-request trigger to pass the domain as a header, but I'd prefer to avoid that if I can.
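For reference, a minimal sketch of that viewer-request idea in Python could look like the following. The header name X-Forwarded-Host and the function name viewer_request_handler are just my choices for illustration, and it assumes the cache behavior is configured to forward that header so the origin-request function can read it.

```python
def viewer_request_handler(event, context):
    """Viewer-request trigger: copy the viewer's Host into a custom header."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # In Lambda@Edge events, header names are lowercase keys and each
    # value is a list of {"key": ..., "value": ...} dicts.
    viewer_host = headers["host"][0]["value"]

    # Hypothetical header name; any custom header would work the same way.
    headers["x-forwarded-host"] = [
        {"key": "X-Forwarded-Host", "value": viewer_host}
    ]
    return request
```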
Is there any other solution?
Thanks
So the problem with your use case is that the Host header exposed to your origin-request Lambda@Edge function contains the domain name of the S3 bucket, not the original Host header that CloudFront received from the viewer, correct?
To see the original Host header that CloudFront received from the viewer, you need to whitelist it. However, CloudFront currently doesn't allow whitelisting headers for S3 origins. This is a bug/limitation that CloudFront should fix.
There is a workaround, though. If the S3 bucket is publicly accessible (i.e. you are not using an origin access identity), you can configure your S3 origin as a custom origin using its website endpoint, such as mybucket.s3-website-us-east-1.amazonaws.com. You will then be able to whitelist the Host header and see the domain name of your website as requested by the viewer, and you can modify the origin request according to your use case. Also remember to change the Host header back to the S3 endpoint so that S3 accepts the request.
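As a rough sketch of that workaround, an origin-request function in Python might look like this. It assumes the Host header is whitelisted, that each site lives under websites/<subdomain>/ in the bucket (a layout I'm guessing from the question), and it reuses the example endpoint name from above. The function and constant names are made up for illustration, and the existence check and backend rendering from the question are left out.

```python
# Example website endpoint from the text above; replace with your own.
S3_WEBSITE_ENDPOINT = "mybucket.s3-website-us-east-1.amazonaws.com"
# Assumed bucket layout: /websites/<subdomain>/<path>
SITES_PREFIX = "/websites"


def origin_request_handler(event, context):
    """Origin-request trigger: route by viewer Host, then restore Host for S3."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # With Host whitelisted, this is the domain the viewer asked for,
    # e.g. "my_website.domain.com".
    viewer_host = headers["host"][0]["value"].lower()
    site = viewer_host.split(".")[0]  # first label, e.g. "my_website"

    # Point the request at that site's folder in the bucket.
    request["uri"] = f"{SITES_PREFIX}/{site}{request['uri']}"

    # Change Host back to the S3 endpoint so that S3 accepts the request.
    headers["host"] = [{"key": "Host", "value": S3_WEBSITE_ENDPOINT}]
    return request
```

With this sketch, a request for my_website.domain.com/index.html would reach S3 as /websites/my_website/index.html with the Host header set to the bucket's website endpoint.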