Prerender + AngularJS - Crawlers time out


Info about setup:

I’ve successfully installed Prerender (https://github.com/prerender/prerender) on my own server, running Ubuntu 16.

This is my .htaccess. It rewrites the URL to the Prerender server when a crawler is detected. Example: http://www.example.nl/63/Merry becomes http://example.nl:3000/http://www.example.nl/63/Merry

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]

RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|redditbot|slackbot|msnbot|googlebot|duckduckbot|bingbot|rogerbot|linkedinbot|embedly|flipboard|tumblr|bitlybot|SkypeUriPreview|nuzzel|Discordbot|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
RewriteRule ^(.*)$  http://example.nl:3000/http://www.example.nl/$1? [R=301,L]
#RewriteRule ^(.*)$  http://art.example.net/$1? [R=301,L] 

RewriteRule ^(.*)/(.*)$ /#$1/$2 [NC,L]

The problem:

Metadata is not loaded by Skype, Reddit, or Twitter when using Prerender. Redirecting instead to the old PHP website, http://art.example.net (currently commented out in the .htaccess), does work. Because the meta tags on the PHP and Angular websites are identical, Prerender is most likely the cause of the issue.

Error example from Twitter (https://cards-dev.twitter.com/validator using url: http://example.nl/63/Merry) using Prerender:

ERROR: Failed to fetch page due to: HttpConnectionTimeout
WARN:  this card is redirected to http://example.nl:3000/http://www.example.nl/63/Merry

Result from Twitter when redirecting to art.example.net (also using the main URL: http://example.nl/63/Merry):

INFO:  Page fetched successfully
INFO:  19 metatags were found
INFO:  twitter:card = summary_large_image tag found
INFO:  Card loaded successfully
WARN:  this card is redirected to http://art.example.net/63/Merry

With the PHP version, the fetch succeeds and all metadata is loaded.

In the future I’d like to completely remove the PHP website, so I really would love for it to work with Prerender. Prerender does work in Discord and Postman (with modified User Agent header). I just don't know why it doesn't work for some other agents.


Solution

  • Your rewrite rule should be a proxy, not a redirect. Redirecting to your prerender server will cause all kinds of issues, including telling Google to send users straight to your prerender server from the search results (which is really bad!).

    The rewrite rule part should be:

    RewriteRule ^(.*)$  http://example.nl:3000/http://www.example.nl/$1? [P,L]
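For context, here is a rough sketch of how the whole .htaccess from the question might look with that change applied. It assumes mod_proxy and mod_proxy_http are enabled alongside mod_rewrite, since the [P] flag depends on mod_proxy. The user-agent list is shortened here purely for readability, and the final AngularJS rule from the question is left out because it is unaffected.

```apache
# Sketch only: assumes mod_rewrite, mod_proxy and mod_proxy_http are enabled.
RewriteEngine on

# Serve existing files, symlinks and directories as-is.
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]

# Crawler detected (user-agent list shortened for this sketch):
# fetch the rendered page from Prerender internally and return it
# under the original URL instead of sending a 301 redirect.
RewriteCond %{HTTP_USER_AGENT} googlebot|bingbot|twitterbot|facebookexternalhit [NC,OR]
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=$
RewriteRule ^(.*)$  http://example.nl:3000/http://www.example.nl/$1? [P,L]
```

With [P] the request to port 3000 is made by Apache itself, so the crawler only ever talks to the original URL and should no longer report a redirect to the Prerender server.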