
What is the point of using robots.txt in GitHub Pages?


I know that the robots.txt file is used to block third-party web crawlers from indexing a site's content.

However, if the goal of this file is to delimit or protect a private area of a site, what is the sense of trying to hide content with robots.txt, when everything can be seen in the GitHub repository anyway?

My question also extends to sites that use a custom domain.

Is there any motivation to use a robots.txt file in GitHub Pages? Yes or no? And why?

Alternative 1
For the content to stay effectively hidden, you would need to pay for a private repository.


Solution

  • The intention of robots.txt is not to delimit private areas, because robots don't even have access to them. Instead, it's for when you have some garbage or miscellaneous files that you don't want indexed by search engines.

    Say, for example, I write Flash games for entertainment and use GitHub Pages to let the games check for updates. I have this file hosted on my GHP, whose entire content is

    10579
    2.2.3
    https://github.com/iBug/SpaceRider/tree/master/SpaceRider%202
    

    It contains three pieces of information: the internal number of the new version, the display name of the new version, and the download link. It is therefore useless when indexed by crawlers, so if I have a robots.txt, that's the kind of thing I would keep away from being indexed.
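
    As a rough sketch: if that version file were served at, say, /SpaceRider/version.txt on my GHP site (the path here is made up for illustration), a robots.txt at the site root that asks crawlers to skip it could look like this:

        # Hypothetical example: keep the machine-readable update file out of search indexes
        User-agent: *
        Disallow: /SpaceRider/version.txt

    Crawlers that honor robots.txt will then skip that file while leaving the rest of the site indexable; it does nothing to hide the file from anyone who requests it directly or browses the repository.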