security.htaccess.htpasswd

.htaccess setup for using a hosting account for live testing



I would like to use my hosting account for live testing, but I want to restrict access and prevent search engines from indexing it. For example, this is the server directory structure inside public_html:

_private
_bin
_cnf
_log
_... (more default hosting directories)
testpublic
css
images
index.html


I want index.html to be visible to everyone, and all other directories (except "testpublic") to be hidden, password-protected, and excluded from search engine indexes.

I would like the "testpublic" directory to be public but not indexed by search engines; I am not sure whether this is possible.

As I understand it, I need two .htaccess files: a general one in "public_html" and a specific one for "testpublic".

I think the general .htaccess (in public_html) should look something like this:

AuthUserFile /home/folder../.htpasswd
AuthName "test!"
AuthType Basic
require user admin123

<FilesMatch "index.html">
Satisfy Any
</FilesMatch>


Can anyone help me create the files with the appropriate properties? Thank you!


Solution

  • You can put a robots.txt file in your site's root folder. Well-behaved crawlers honor this file and will not crawl the files and folders it disallows. It is advisory only and does not restrict access.

    Example robots.txt that tells all (*) crawlers to move on and index nothing:

    User-agent: *
    Disallow: /
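
    One caveat for the setup in the question: a root .htaccess that demands a password for everything also covers robots.txt, so crawlers would get the login challenge instead of the rules. A minimal sketch of an exemption, assuming the Apache 2.2-style access directives the question already uses (on Apache 2.4 these need mod_access_compat, or the newer Require syntax instead):

    ```apache
    # Sketch only: let anyone fetch index.html and robots.txt without a password.
    # Everything else stays behind the Basic auth configured in the same file.
    <FilesMatch "^(index\.html|robots\.txt)$">
      Order allow,deny
      Allow from all
      Satisfy Any
    </FilesMatch>
    ```

    Because .htaccess rules are inherited, the pattern also matches files with those names in subdirectories.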
    

    You can also use .htaccess files to fine-tune what your server (assuming Apache) serves and which entries appear in auto-generated directory listings. In that case, add

    IndexIgnore *
    

    to your .htaccess file to hide all entries from auto-generated directory listings.
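
    If the goal is to have no listing at all rather than an empty one, the usual alternative is to switch auto-indexing off. A sketch, assuming the host permits overriding Options from .htaccess (AllowOverride Options or All):

    ```apache
    # Disable mod_autoindex listings for this directory and everything below it.
    # A request for a directory without an index file then returns 403 Forbidden.
    Options -Indexes
    ```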

    Updated (Credit to https://stackoverflow.com/users/1714715/samuel-cook):

    If you want to block a specific bot/crawler and know its User-Agent string, you can do so in your .htaccess:

    <IfModule mod_rewrite.c>
      RewriteEngine on
      RewriteCond %{HTTP_USER_AGENT} Googlebot
      RewriteRule ^.* - [F,L]
    </IfModule> 
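
    For the "testpublic" directory, one possible sketch lifts the password inherited from the parent folder and asks crawlers not to index what they fetch. It assumes Apache 2.2-style access directives as in the question and that mod_headers is available; neither is confirmed for this host.

    ```apache
    # testpublic/.htaccess (sketch)
    # Override the inherited Basic auth so the directory is open to everyone.
    Order allow,deny
    Allow from all
    Satisfy Any

    # Ask crawlers not to index or follow links in anything served from here.
    <IfModule mod_headers.c>
      Header set X-Robots-Tag "noindex, nofollow"
    </IfModule>
    ```

    A crawler only sees this header if it is allowed to fetch the page, so it works as an alternative to a robots.txt Disallow for that path rather than in combination with it.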
    

    Hope this helps.