Tags: amazon-web-services, web-crawler, amazon-cloudfront, google-search, robots.txt

How to update/replace the robots.txt file in AWS CloudFront


I have a website, www.example.com. When I access the robots.txt file at www.example.com/robots.txt, it shows several lines of text created/prepared by our SEO team.

There is also a subdomain, assets.example.com, which points to CloudFront. When I access the robots.txt file through the CloudFront URL, https://assets.example.com/robots.txt, the browser shows the following result:

User-agent: *
Disallow: / 

There is a request to update the robots.txt content in AWS CloudFront so that https://assets.example.com/robots.txt and https://www.example.com/robots.txt show the same text. I couldn't find where robots.txt is placed in CloudFront.

Is it possible to update robots.txt in CloudFront? Does CloudFront play any role here, or do we need to configure robots.txt for assets.example.com the same way it is configured for example.com?

Please help me out. I'm very confused here.


Solution

  • In the CloudFront distribution for assets.example.com, add a new origin with the domain name www.example.com, then add a new cache behavior with the path pattern robots.txt and assign that origin to it.

    This setup forwards a request for assets.example.com/robots.txt to www.example.com/robots.txt, so you no longer have to maintain a duplicate copy of the file. A scripted version of the same change is sketched below.
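
If you prefer to script the change rather than click through the console, here is a minimal sketch using boto3 (Python). The distribution ID and the origin ID "www-example-com" are placeholders I've assumed for illustration; the cache behavior uses the legacy ForwardedValues/MinTTL fields rather than a cache policy.

```python
import boto3

DISTRIBUTION_ID = "E1234567890ABC"  # placeholder: the assets.example.com distribution

cloudfront = boto3.client("cloudfront")

# Fetch the current distribution config and its ETag (required for the update call).
resp = cloudfront.get_distribution_config(Id=DISTRIBUTION_ID)
config = resp["DistributionConfig"]
etag = resp["ETag"]

# New origin pointing at the main site.
new_origin = {
    "Id": "www-example-com",          # arbitrary identifier, assumed for this sketch
    "DomainName": "www.example.com",
    "CustomOriginConfig": {
        "HTTPPort": 80,
        "HTTPSPort": 443,
        "OriginProtocolPolicy": "https-only",
    },
}
config["Origins"]["Items"].append(new_origin)
config["Origins"]["Quantity"] += 1

# Cache behavior that routes robots.txt requests to the new origin.
new_behavior = {
    "PathPattern": "robots.txt",
    "TargetOriginId": "www-example-com",
    "ViewerProtocolPolicy": "redirect-to-https",
    "ForwardedValues": {"QueryString": False, "Cookies": {"Forward": "none"}},
    "MinTTL": 0,
    "TrustedSigners": {"Enabled": False, "Quantity": 0},
}
behaviors = config.setdefault("CacheBehaviors", {"Quantity": 0, "Items": []})
behaviors.setdefault("Items", []).append(new_behavior)
behaviors["Quantity"] = len(behaviors["Items"])

# Apply the updated config; IfMatch must carry the ETag from the read above.
cloudfront.update_distribution(
    DistributionConfig=config, Id=DISTRIBUTION_ID, IfMatch=etag
)
```

After the distribution finishes deploying, a request to https://assets.example.com/robots.txt should return the same content as https://www.example.com/robots.txt.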