e107 redirect: Facebook scraping error


Here's my .htaccess file:

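# 1. Deny POST requests to anything other than the scripts listed below,
# UNLESS they are referred from our own site
RewriteCond %{REQUEST_METHOD} POST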
RewriteCond %{REQUEST_URI} !^/?(usersettings\.php|page\.php|news\.php|signup\.php|admin/|plugins/forum/|plugins/.*/.*config\.php)
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?lf1medsoc\.com [NC]
RewriteRule .* - [F,L]

# 2. Redirect the following user agents, and any request for the following
# files, back to the requester's own IP address
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.76\ \[ru\]\ \(X11;\ U;\ SunOS\ 5\.7\ sun4u\) [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5\.0$ [OR]
RewriteCond %{HTTP_USER_AGENT} (Bot\ Search|kangen|CaSpEr|MaMa|crew|plaNETWORK|dex|perl\ post$) [NC,OR]
RewriteCond %{REQUEST_URI} (contact\.php|help_us\.php|forum_index\.php|crossdomain\.xml|\.htaccess)
RewriteRule .* http://%{REMOTE_ADDR}/ [R,L]

# 3. Deny access to requests with contact.php or help_us.php in the query
# string, UNLESS those are referred from our own site (e.g. search)
RewriteCond %{QUERY_STRING} (contact\.php|request\.php|help_us\.php|casper)
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?lf1medsoc\.com [NC]
RewriteRule .* - [F,L]

# 4. Redirect empty user agent, UNLESS it's accessing the RSS feed
RewriteCond %{HTTP_USER_AGENT} ^$ 
RewriteCond %{REQUEST_URI} !^/?e107_plugins/rss_menu/rss\.php
RewriteRule .* http://%{REMOTE_ADDR}/ [R,L]

# 5. Deny access to these files UNLESS referred from our site.
RewriteCond %{REQUEST_URI} ^/?(top|download|user|search|submitnews|fpw)\.php
RewriteCond %{HTTP_REFERER} !^http://(.*\.)?lf1medsoc\.com [NC]
RewriteRule .* - [F]

Facebook linter results for http://www.lf1medsoc.com/page.php?19 (publicly accessible, no login needed, etc.):

(WHOLE result page)

Scrape Information

Response Code: 200
Fetched URL: http://www.lf1medsoc.com/page.php?19
Canonical URL: http://www.lf1medsoc.com/
Final URL: http://www.lf1medsoc.com/page.php?2

Errors That Must Be Fixed

Circular Redirect Path: Circular redirect path detected (see 'Redirect Path' section for details).

Redirect Path

original: http://www.lf1medsoc.com/page.php?19
og:url: http://www.lf1medsoc.com/
302: http://www.lf1medsoc.com/page.php?2
og:url: http://www.lf1medsoc.com/

Final URL is in bold (this is the URL we tried to extract metadata from). URLs that are part of the circular redirect path are highlighted.

Am I missing something in the .htaccess? Is there a way to add the Facebook user agent so that it is allowed access?

Note: page.php?2 is the homepage (redirects from lf1medsoc.com --> index.php --> page.php?2)


Solution

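  • Your og:url tag on http://www.lf1medsoc.com/page.php?2 points to http://www.lf1medsoc.com/, which itself redirects (302) back to page.php?2. The scraper treats og:url as the canonical URL and follows it, so it ends up bouncing between the two addresses and reports a circular redirect.

    Change the og:url on that page to http://www.lf1medsoc.com/page.php?2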
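
To see why that one tag produces the loop, here is a minimal sketch in Python. It is a toy model, not Facebook's actual scraper logic: it assumes the scraper follows an HTTP redirect when there is one, otherwise follows og:url, and stops once a page points at itself. The SITE table is hand-written from the linter output above (the index.php hop is collapsed into a single redirect), and the names SITE, FIXED and follow are made up for illustration.

```python
ROOT = "http://www.lf1medsoc.com/"
HOME = "http://www.lf1medsoc.com/page.php?2"
PAGE19 = "http://www.lf1medsoc.com/page.php?19"

# Hypothetical model of the site: for each URL, the HTTP redirect target
# (if any) and the og:url the page declares, as reported by the linter.
SITE = {
    PAGE19: {"redirect": None, "og_url": ROOT},
    ROOT: {"redirect": HOME, "og_url": None},
    HOME: {"redirect": None, "og_url": ROOT},
}

# Same site after the suggested fix: the homepage's og:url names itself.
FIXED = {**SITE, HOME: {"redirect": None, "og_url": HOME}}


def follow(start, site, max_hops=10):
    """Follow redirects and og:url hints; return (path, circular)."""
    path = [start]
    seen = {start}
    url = start
    for _ in range(max_hops):
        page = site[url]
        # Assumption: an HTTP redirect wins over og:url.
        nxt = page["redirect"] or page["og_url"]
        if nxt is None or nxt == url:
            # Nothing left to follow, or the page is its own canonical URL.
            return path, False
        path.append(nxt)
        if nxt in seen:
            # We have come back to a URL already visited: a loop.
            return path, True
        seen.add(nxt)
        url = nxt
    # Hop limit reached without settling; treat it as circular too.
    return path, True


if __name__ == "__main__":
    print(follow(PAGE19, SITE))
    print(follow(PAGE19, FIXED))
```

With the original tags, follow(PAGE19, SITE) keeps returning to a URL it has already seen and reports a loop. With the corrected og:url, the walk settles on page.php?2. In this model page.php?19 still resolves to the homepage, because its own og:url also points at http://www.lf1medsoc.com/, so it may be worth checking that each page declares its own address.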