Good afternoon, I hope you can help me, I have a question:
I have a server with GoDaddy (Deluxe shared hosting), and on it I have the following layout:
/
--/mail
--/etc
--/public_html
----/web1
----/web2
------/index.php
------/css
------/img
------/js
----/web3
--/tmp
I am creating a robots.txt file with which I want NOTHING in web2 to be indexed (by "nothing" I mean everything in it: index.php, css, img, js), but I DO want the other sites (web1, web3) to be indexed. How can I accomplish this? Which folder does the robots.txt file have to go in: /, /public_html, or /web2?
Could you also help me with the contents of the robots.txt file?
Thank you very much in advance.
You'll use two different robots.txt files. One goes into /web1 and the other goes into /web2, because crawlers request robots.txt from the root of each domain. If /web1 is the document root of 'somedomain.com', crawlers will not be able to crawl up a folder and into the /web2 folder (or any other folder on the same level).
Edit: Some sample robots.txt files
To exclude all robots from the entire server (where "server" == "domain")
User-agent: *
Disallow: /
To allow all robots complete access
User-agent: *
Disallow:
(or just create an empty "/robots.txt" file, or don't use one at all)
To exclude all robots from part of the server
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
To exclude a single robot
User-agent: BadBot
Disallow: /
To allow a single robot
User-agent: Google
Disallow:
User-agent: *
Disallow: /
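One way to sanity-check these sample files without deploying anything is Python's standard-library robots.txt parser. This is just an illustrative sketch of the "allow a single robot" pattern above; the bot names are placeholders:

```python
from urllib.robotparser import RobotFileParser

# The "allow a single robot" sample from above:
# Google may crawl, every other robot is locked out.
rules = [
    "User-agent: Google",
    "Disallow:",
    "",
    "User-agent: *",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Google", "/page.html"))  # True: Google is allowed
print(rp.can_fetch("BadBot", "/page.html"))  # False: everyone else is blocked
```

The same `parse()` / `can_fetch()` calls work for any of the sample files, so you can verify a rule set before uploading it.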
So, /web2 would get a robots.txt file with:
User-agent: *
Disallow: /
and /web1 and /web3 would get empty robots.txt files, or ones containing:
User-agent: *
Disallow:
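To double-check this final setup, you can feed both files through the same standard-library parser. A minimal sketch, assuming /web2 is served as the root of its own domain:

```python
from urllib.robotparser import RobotFileParser

# robots.txt served from the root of the web2 domain: block everything.
web2 = RobotFileParser()
web2.parse(["User-agent: *", "Disallow: /"])

# robots.txt for web1/web3: an empty file (no rules) allows everything.
web1 = RobotFileParser()
web1.parse([])

print(web2.can_fetch("*", "/index.php"))  # False: nothing in web2 is crawlable
print(web1.can_fetch("*", "/index.php"))  # True: web1 is fully crawlable
```

Well-behaved crawlers will honor these rules, but robots.txt is advisory only; it does not actually prevent access to the files.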