Tags: amazon-s3, seo, amazon-cloudfront, react-router

React Router + AWS Backend, how to SEO


I am using React and React Router in my single-page web application. Since I'm doing client-side rendering, I'd like to serve all of my static files (HTML, CSS, JS) from a CDN. I'm using Amazon S3 to host the files and Amazon CloudFront as the CDN.

When the user requests /css/styles.css, the file exists, so S3 serves it. When the user requests /foo/bar, no such file exists, so S3's routing rules redirect to /#!/foo/bar, which serves index.html. On the client side I remove the hashbang so my URLs stay pretty.
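
For reference, here is a sketch of the S3 static-website configuration that produces that redirect behaviour, written against the current AWS SDK for JavaScript (v3). The bucket name and region are placeholders, since the actual configuration isn't shown in the question:

```typescript
import { S3Client, PutBucketWebsiteCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" }); // region is a placeholder

async function configureSpaRedirects(): Promise<void> {
  await s3.send(
    new PutBucketWebsiteCommand({
      Bucket: "my-spa-bucket", // hypothetical bucket name
      WebsiteConfiguration: {
        IndexDocument: { Suffix: "index.html" },
        RoutingRules: [
          {
            // Any key S3 can't find (e.g. /foo/bar) returns a 404...
            Condition: { HttpErrorCodeReturnedEquals: "404" },
            // ...and is redirected to /#!/foo/bar, which serves index.html.
            Redirect: { ReplaceKeyPrefixWith: "#!/" },
          },
        ],
      },
    })
  );
}

configureSpaRedirects().catch(console.error);
```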

This all works great for 100% of my users.

  • All static files are served through a CDN
  • A dynamic URL is routed to /#!/{...}, which serves index.html (my single-page application)
  • My client side removes the hashbang so the URLs are pretty again (see the sketch after this list)
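
A minimal sketch of that client-side clean-up, assuming it runs before the router initializes (the question doesn't show the exact implementation):

```typescript
// Runs as early as possible, before the router mounts.
const { hash } = window.location;

if (hash.startsWith("#!")) {
  // "/#!/foo/bar" arrives with hash "#!/foo/bar"; drop the "#!" marker.
  const prettyPath = hash.slice(2) || "/";
  // Rewrite the address bar without triggering a reload, then let the
  // router handle the pretty URL as usual.
  window.history.replaceState(null, "", prettyPath);
}
```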

The problem

The problem is that Google won't crawl my website. Here's why:

  • Google requests /
  • They see a bunch of links, e.g. to /foo/bar
  • Google requests /foo/bar
  • They get redirected to /#!/foo/bar (302 Found)
  • They remove the hashbang and request /

Why is the hashbang being removed? My app works great for 100% of my users, so why do I need to redesign it just to get Google to crawl it properly? It's 2016, just follow the hashbang...

</rant>

Am I doing something wrong? Is there a better way to get S3 to serve index.html when it doesn't recognize the path?

Setting up a Node server to handle these paths isn't the right solution, because that defeats the entire purpose of having a CDN.

In this thread, Michael Jackson, a top contributor to React Router, says, "Thankfully hashbang is no longer in widespread use." How would you change my setup to not use the hashbang?
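
For context, dropping the hashbang on the client side means using the HTML5 History API instead of hash-based URLs. A minimal sketch with the 2016-era React Router (v2/v3) API, where the route components are hypothetical placeholders:

```tsx
import * as React from "react";
import { render } from "react-dom";
import { Router, Route, browserHistory } from "react-router";

// Hypothetical placeholder components standing in for the real app.
import { Home, FooBar } from "./components";

render(
  <Router history={browserHistory}>
    <Route path="/" component={Home} />
    <Route path="/foo/bar" component={FooBar} />
  </Router>,
  document.getElementById("root")
);
```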


Solution

  • You can also check out this trick: set up a CloudFront distribution, then alter the 404 behaviour in the "Error Pages" section of your distribution. That way you can use domain.com/foo/bar links again :)
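
A minimal sketch of that idea using the AWS CDK (v2), assuming the static build lives in an S3 bucket fronted by the distribution; the construct and bucket names are placeholders. The custom error response maps the origin's 404 to a 200 that serves /index.html:

```typescript
import { Duration, Stack, StackProps } from "aws-cdk-lib";
import * as s3 from "aws-cdk-lib/aws-s3";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";
import { Construct } from "constructs";

export class SpaStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Hypothetical bucket holding the static build (index.html, css, js).
    const siteBucket = new s3.Bucket(this, "SiteBucket");

    new cloudfront.Distribution(this, "SiteDistribution", {
      defaultBehavior: { origin: new origins.S3Origin(siteBucket) },
      defaultRootObject: "index.html",
      // The "Error Pages" trick: when the origin returns 404 for a dynamic
      // URL like /foo/bar, serve /index.html with a 200 so the client-side
      // router takes over instead of a hashbang redirect.
      errorResponses: [
        {
          httpStatus: 404,
          responseHttpStatus: 200,
          responsePagePath: "/index.html",
          ttl: Duration.seconds(0),
        },
      ],
    });
  }
}
```

Because the response is a 200 with index.html rather than a 302 to a fragment URL, a crawler requesting /foo/bar receives the single-page app directly at the pretty URL, and the static files still come straight from the CDN.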