I am currently editing my robots.txt, which looks like this:
User-agent: *
Disallow: /adm/*
Disallow: /download/*
Disallow: /cache
Disallow: /files
Disallow: /viewforum.php?f=146
Disallow: /ucp.php
Disallow: /mcp.php
Disallow: /memberlist.php
Disallow: /config.php
Disallow: /cron.php
Disallow: /faq.php
Disallow: /report.php
Sitemap: http://www.website.com/app.php/sitemap.xml
However, I am wondering how to do a few things correctly.
1) Would this correctly block search engines from accessing a forum area?
Disallow: /viewforum.php?f=146
I wanted one area hidden from search engines but the rest of the forum areas fully readable as normal.
2) How do you block access to the internal phpBB folders and keep search engines out of the admin area? Are these rules correct?
Disallow: /adm/*
Disallow: /download/*
3) Do the rules for php files work correctly?
Disallow: /ucp.php
Also, is there anything else I should know or do?
The line
Disallow: /viewforum.php?f=146
disallows crawling of URLs whose paths start with /viewforum.php?f=146.
So URLs like these would not be allowed to be crawled:
http://example.com/viewforum.php?f=146
http://example.com/viewforum.php?f=1461
http://example.com/viewforum.php?f=146a
http://example.com/viewforum.php?f=146/foo
http://example.com/viewforum.php?f=146&bar
(It works the same for /ucp.php, /adm/, and /download/, of course. Note that this means that the appended * is not needed, unless it’s actually part of the URL.)
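The prefix matching described above can be sketched in a few lines of Python. This is only an illustration, assuming plain prefix rules with no wildcard or Allow handling; the helper names path_with_query and is_disallowed are made up for this example and are not part of any robots.txt library.

```python
from urllib.parse import urlsplit

def path_with_query(url):
    """Return the part of the URL that robots.txt rules are compared against."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target

def is_disallowed(url, disallow_prefixes):
    """True if the URL's path (plus query) starts with any Disallow value."""
    target = path_with_query(url)
    # An empty Disallow value blocks nothing, so skip it.
    return any(target.startswith(prefix) for prefix in disallow_prefixes if prefix)

# Rules from the question, written without the trailing "*".
RULES = ["/viewforum.php?f=146", "/ucp.php", "/adm/", "/download/"]

URLS = [
    "http://example.com/viewforum.php?f=146",
    "http://example.com/viewforum.php?f=1461",
    "http://example.com/viewforum.php?f=146a",
    "http://example.com/viewforum.php?f=146/foo",
    "http://example.com/viewforum.php?f=146&bar",
    "http://example.com/viewforum.php?f=14",  # shorter than the rule, so not matched
]

for url in URLS:
    print(url, "->", "blocked" if is_disallowed(url, RULES) else "allowed")
```

Under these assumptions, every URL from the list above is reported as blocked, while the last one (f=14) is not, because the rule is longer than the URL's path and query.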
So if the forum overview is at http://example.com/viewforum.php?f=146, it will be blocked. However, note that the same page might also be accessible from a different URL, e.g. something like: http://example.com/viewforum.php?someOtherParameter&f=146
Also note that this will not necessarily block crawling of forum threads in that forum area (because their URLs typically don’t start with this path). While conforming bots won’t crawl this forum area page, they might find links to the threads from some other place.
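To make these two caveats concrete, here is a small Python sketch that applies the same prefix comparison to a reordered query string and to a thread URL. The function name matches_rule is made up for this example, and the viewtopic.php URL is an assumed thread address used only for illustration; check what your own board's thread URLs look like.

```python
from urllib.parse import urlsplit

RULE = "/viewforum.php?f=146"

def matches_rule(url, rule=RULE):
    """Plain prefix comparison of path + query against one Disallow value."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return target.startswith(rule)

overview = "http://example.com/viewforum.php?f=146"
# Same page reached with the parameters in a different order.
reordered = "http://example.com/viewforum.php?someOtherParameter&f=146"
# Hypothetical thread URL inside that forum area (assumption, not from the question).
thread = "http://example.com/viewtopic.php?f=146&t=1"

for url in (overview, reordered, thread):
    print(url, "->", "blocked" if matches_rule(url) else "still crawlable")
```

In this sketch only the first URL matches the rule. Covering the other two would need additional Disallow lines, or a different mechanism such as a noindex signal on the pages themselves.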