I have been using the following .htaccess file in Angular apps, which avoids the 'page not found...' when user does a manual refresh on a path other than root path, or enters a url path that does not exist.
# For instructions and new versions of this Gist go to:
# https://gist.github.com/julianpoemp/bcf277cb56d2420cc53ec630a04a3566
# Version 1.4.0
<IfModule mod_rewrite.c>
RewriteEngine On
# -- REDIRECTION to https (optional):
# If you need this, uncomment the next two commands
# RewriteCond %{HTTPS} !on
# RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
# --
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^(.*) index.html [NC,L]
</IfModule>
#------------ BROWSER CACHING (optional)
# Disable browser caching in production. You can add/remove file extension as you wish.
<IfModule mod_headers.c>
Header set Cache-Control "no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires 0
</IfModule>
<FilesMatch "\.(html|htm|js|json|css)$">
<IfModule mod_headers.c>
FileETag None
Header unset ETag
Header unset Pragma
Header unset Cache-Control
Header unset Last-Modified
Header set Pragma "no-cache"
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Expires "Mon, 10 Apr 1972 00:00:00 GMT"
</IfModule>
</FilesMatch>
#------------
I am using prerender.io for social media crawlers etc. They supply following .htaccess to enable redirect to the cached site on a user-agent basis. It is working and I have tested the correct meta tags are read in the facebook sharing debugger
# Change YOUR_TOKEN to your prerender token
# Change https://service.prerender.io/ (at the end of the last RewriteRule)
# to http://localhost:3000/ when testing with a local prerender server
<IfModule mod_headers.c>
RequestHeader set X-Prerender-Token "YOUR_TOKEN"
RequestHeader set X-Prerender-Version "[email protected]"
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
<IfModule mod_proxy_http.c>
RewriteCond %{HTTP_USER_AGENT} googlebot|bingbot|yandex|baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
RewriteCond %{REQUEST_URI} ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff|\.svg))
RewriteRule ^(index\.html|index\.php)?(.*) https://service.prerender.io/%{REQUEST_SCHEME}://%{HTTP_HOST}/$2 [P,END]
</IfModule>
</IfModule>
My problem arises from trying to combine rules in both .htaccess files to achieve the effects of both simultaneously. When I include:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^(.*) index.html [NC,L]
the crawlers no longer get redirected to preloader.io service. If I don't include the rules above the angular app is broken in the way that it cant manually reload on a url path other than root path.
Here is a non-working .htaccess I have tried:
<IfModule mod_rewrite.c>
RewriteEngine On
# -- REDIRECTION to https (optional):
# If you need this, uncomment the next two commands
# RewriteCond %{HTTPS} !on
# RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
# --
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^(.*) index.html [NC,L]
</IfModule>
<IfModule mod_headers.c>
Header set Cache-Control "no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires 0
</IfModule>
<IfModule mod_headers.c>
RequestHeader set X-Prerender-Token "MY_TOKEN_WAS_HERE"
RequestHeader set X-Prerender-Version "[email protected]"
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
<IfModule mod_proxy_http.c>
RewriteCond %{HTTP_USER_AGENT} googlebot|bingbot|yandex|baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
RewriteCond %{REQUEST_URI} ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff|\.svg))
RewriteRule ^(index\.html|index\.php)?(.*) https://service.prerender.io/%{REQUEST_SCHEME}://%{HTTP_HOST}/$2 [P,END]
</IfModule>
</IfModule>
I think I figured it out. I have put the prerender.io proxy rewrite rule to the top, and included the [L] flag in the rule, which stops apache processing further rewrite rules below in the .htaccess file... Seems to work. Maybe this will help someone so I'm posting it here
<IfModule mod_rewrite.c>
RewriteEngine On
<IfModule mod_proxy_http.c>
RewriteCond %{HTTP_USER_AGENT} googlebot|bingbot|yandex|baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest\/0\.|pinterestbot|slackbot|vkShare|W3C_Validator|whatsapp [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_
RewriteCond %{REQUEST_URI} ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff|\.svg))
RewriteRule ^(index\.html|index\.php)?(.*) https://service.prerender.io/%{REQUEST_SCHEME}://%{HTTP_HOST}/$2 [P,END,L]
</IfModule>
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
# -- REDIRECTION to https (optional):
# If you need this, uncomment the next two commands
# RewriteCond %{HTTPS} !on
# RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
# --
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^(.*) index.html [NC,L]
</IfModule>
<IfModule mod_headers.c>
Header set Cache-Control "no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires 0
</IfModule>
<IfModule mod_headers.c>
RequestHeader set X-Prerender-Token "XVkAwr4MuWz9K5q9Lp4X"
RequestHeader set X-Prerender-Version "[email protected]"
</IfModule>