I have a site with the URL https://example.com/file.php. I don't use friendly URLs, frameworks, etc. But I see that Google is indexing duplicate content from my website under URLs that do not exist, like:
https://example.com/file.php/file2.php
https://example.com/file.php/file3.php
https://example.com/file.php/file3.php/hihi/other/other2.php (status 200)
But those URLs do not exist. In every case the server shows me the content from file.php. I deleted my .htaccess because I thought I had a bad rule, but that is not the cause.
As @Quentin has already pointed out, this is the default for PHP. Or, more specifically, the Apache handler that processes PHP allows path-info (additional pathname information on the URL) by default. Plain text/html files do not allow path-info unless it is explicitly enabled.
For example, given the following URL:
https://example.com/file.php/<anything>
Where file.php is a physical file on the filesystem, /<anything> is the additional pathname information, and it is available to PHP through the $_SERVER['PATH_INFO'] variable.
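To see what PHP receives, a minimal sketch like the following could be placed in file.php. The helper name pathInfoOf is hypothetical and only for illustration. It takes a $_SERVER-style array so it can be tried outside a web request too.

```php
<?php
// Hypothetical helper: return the path-info from a $_SERVER-style array,
// or an empty string when the request carried none.
function pathInfoOf(array $server): string
{
    return $server['PATH_INFO'] ?? '';
}

// In a real request, pass the superglobal itself.
$pathInfo = pathInfoOf($_SERVER);
echo $pathInfo === '' ? "No path-info\n" : "Path-info: {$pathInfo}\n";
```

Requesting https://example.com/file.php/file2.php should then report /file2.php as the path-info, while a plain request for file.php should report none.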
However, you can disable this in .htaccess with the AcceptPathInfo directive:
AcceptPathInfo Off
Now any URL that contains path-info will trigger a 404 Not Found.
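If changing the server configuration is not possible, a similar effect can be approximated inside the script itself. This is only a sketch under that assumption, not a replacement for the directive. The function name hasPathInfo is hypothetical, and the guard has to sit at the top of every affected script, before any output is sent.

```php
<?php
// Hypothetical guard: true when the request carried extra path segments
// after the script name (for example /file.php/file2.php).
function hasPathInfo(array $server): bool
{
    return isset($server['PATH_INFO']) && $server['PATH_INFO'] !== '';
}

// Refuse such requests with a 404 instead of serving file.php's content.
if (hasPathInfo($_SERVER)) {
    http_response_code(404);
    exit('Not Found');
}

// ... normal file.php output continues here ...
```

The AcceptPathInfo directive remains the simpler option where .htaccess is available, because it covers every script at once.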