Tags: seo, robots.txt, google-search-console

Disallow URLs with query params in Robots.txt


My site was hacked and Google crawled some weird URLs, e.g.

www.tuppleapps.com/?andsd123
www.tuppleapps.com/?itlq7433
www.tuppleapps.com/?copz656

I want to disallow these URLs with query params, but it should not affect URLs without params. I tried this:

Disallow: /?*

But will it affect my normal site URLs, or will it only disallow the URLs with query params?


Solution

  • That will only disallow URLs with a question mark in them. Assuming your normal site content doesn't use query parameters, it shouldn't be affected.

    The * at the end of your rule is not needed. Your rule is 100% the same as:

    Disallow: /?
    

    Robots.txt rules without a wildcard are "starts with" rules, so a wildcard at the very end of a rule is never necessary. Rules without wildcards will also be understood by more bots, because wildcards are a non-standard robots.txt extension that most bots can't process.

    However, I question whether disallowing these URLs in robots.txt is the correct action at all. You should make sure that these URLs return an error status code (such as 404 Not Found or 410 Gone), and then let Googlebot crawl them and see the error so it drops them from the index.
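To make the "starts with" matching concrete, here is a rough Python sketch of both behaviors. The function names `blocked_simple` and `blocked_wildcard` are hypothetical, not from any real crawler; they only illustrate why `Disallow: /?` and `Disallow: /?*` match the same URLs.

```python
import re

def blocked_simple(rule: str, path: str) -> bool:
    # Basic crawlers: a rule is a plain "starts with" comparison,
    # with no wildcard support at all.
    return path.startswith(rule)

def blocked_wildcard(rule: str, path: str) -> bool:
    # Wildcard-aware crawlers (e.g. Googlebot): '*' matches any run
    # of characters; the rule still anchors at the start of the path.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.match(pattern, path) is not None

# "/?" (simple) and "/?*" (wildcard) behave identically here:
for path in ["/", "/?andsd123", "/about"]:
    assert blocked_simple("/?", path) == blocked_wildcard("/?*", path)
```

The hacked URLs like `/?andsd123` start with `/?`, so they are blocked, while `/` and normal pages are not, which is why the trailing `*` adds nothing.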
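If you go the error-code route instead, the idea can be sketched framework-agnostically like this. This is a minimal illustration, not a drop-in fix; `status_for` and the `known_params` parameter are my own invented names, and a real site would wire this logic into its server or application config.

```python
from urllib.parse import urlparse

def status_for(url: str, known_params: set) -> int:
    """Return 410 Gone for unrecognized query-string URLs, 200 otherwise."""
    query = urlparse(url).query
    if query and query not in known_params:
        # 410 tells Googlebot the page is permanently gone,
        # so it can drop the URL from the index.
        return 410
    return 200

print(status_for("http://www.tuppleapps.com/?andsd123", set()))  # 410
print(status_for("http://www.tuppleapps.com/", set()))           # 200
```

Unlike a robots.txt disallow, which only stops crawling, an error response lets Google actually remove the already-indexed hacked URLs.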