Tags: javascript, web-crawler, bots, googlebot, google-crawlers

Prevent bots from crawling dynamic javascript files


I need to prevent bots from crawling my .js files. As you know, Google is able to crawl and index .js files. There is only one such file, but its name changes with each new deployment and update.

For example:

<script type="text/javascript" src="/7c2af7d5829e81965805cc932aeacdea8049891f.js?js_resource=true"></script>

Since I don't know how to verify this myself, I want to make sure the following is correct:

// robots.txt
Disallow: /*.js$

Also, does the same rule apply if the .js file is served through a CDN?


Solution

  • # robots.txt
    User-agent: *
    Disallow: /*.js?js_resource
    

    This works. Note that your original pattern, Disallow: /*.js$, would not match the file above: the $ anchors the pattern at the end of the URL, and your URL ends with the query string ?js_resource=true, not with .js. Also, a Disallow line only takes effect inside a User-agent group, hence the added User-agent: * line. You can test your robots.txt with the robots.txt Tester in Google Search Console (formerly Google Webmaster Tools).

    As for the CDN: robots.txt is evaluated per host, so a file served from a CDN domain is governed by the robots.txt at the root of that CDN domain, not by the one on your own site.
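To see why the query string matters, here is a minimal sketch of Google-style robots.txt pattern matching (where * matches any run of characters and a trailing $ anchors the pattern at the end of the URL), tested against the URL from the question. The helper name robots_match is mine, not part of any library:

```python
import re

def robots_match(pattern: str, url_path: str) -> bool:
    """Check a URL path against a robots.txt path pattern.

    Implements the Google wildcard extensions: '*' matches any
    sequence of characters, and a trailing '$' anchors the match
    at the end of the URL. Everything else is matched literally
    (note that '?' is a literal character, not a wildcard).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal character; expand '*' to '.*'.
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    if anchored:
        regex += "$"
    return re.match(regex, url_path) is not None

url = "/7c2af7d5829e81965805cc932aeacdea8049891f.js?js_resource=true"
print(robots_match("/*.js$", url))             # False: URL ends in '=true', not '.js'
print(robots_match("/*.js?js_resource", url))  # True: unanchored prefix match
```

This is only a sketch for reasoning about the rules locally; the robots.txt Tester in Search Console remains the authoritative check for how Googlebot interprets your file.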