Tags: http, browser, http-status-code-404

Server returns 404 for a web page, but page is showing fine in browser - why?


I came across a strange web page. (And being a developer, I have to solve the mystery.)

When accessing the web page in any browser, all seems normal. The web page is displayed as expected.

But according to the browser's developer console, the server actually returns a 404 status code:

[Screenshot: developer console showing the 404 status code]

So why is the browser rendering a page?

Looking at the response body shows that valid HTML is returned:

[Screenshot: response body containing valid HTML]

Hold on. Responding 404 and sending the HTML along the way? And the browser renders it??

Why is this happening? Is this some server misconfiguration? Or is something clever going on here that I don't understand? Is there a practical reason for configuring a server on purpose to behave like this?
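
To reproduce the observation outside the browser's developer tools, a minimal Python sketch like the following could be used. It relies only on the standard library; the function name `fetch_status_and_body` is made up for illustration, and the URL passed to it would be the page under investigation.

```python
import urllib.error
import urllib.request
from typing import Tuple


def fetch_status_and_body(url: str) -> Tuple[int, str]:
    """Return (status code, decoded body), even for 4xx/5xx responses."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx, but the exception object still carries
        # the status code and whatever body the server sent along with it.
        body = error.read().decode("utf-8", errors="replace")
        return error.code, body
```

For a page that behaves as described, the call would be expected to return 404 together with the full HTML, matching what the developer console shows.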


Solution

  • Another answer on Stack Overflow contains some interesting information: An HTTP status code of 404 plus an HTML response body is actually recommended by the spec.

    The 4xx class of status code is intended for cases in which the
    client seems to have erred. Except when responding to a HEAD
    request, the server SHOULD include a representation containing an
    explanation of the error situation, and whether it is a temporary or
    permanent condition. These status codes are applicable to any
    request method. User agents SHOULD display any included
    representation to the user.

    This leaves me with two possible explanations:

    Explanation 1: it's a server misconfiguration.

    • the server wrongly returns a 404 status code
    • the browser treats the response body as the explanation of the error and displays it; for the end user, this is the actual page

    Explanation 2: it's done on purpose to defeat crawlers and page watchers.

    • the server returns 404 on purpose - many non-browser user agents won't process the result because they interpret it as an error
    • browsers are unaffected, the end user doesn't care as long as the page is being displayed

    The second one would indeed be kind of clever if you don't want your page to be indexed.
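
To make the second explanation concrete, here is a small self-contained Python sketch using only the standard library. It is not taken from the site in question: the handler `PageWith404Handler`, the `naive_crawl` function and the `demo` helper are all hypothetical, and real crawlers may behave differently. The server sends a normal-looking page under a 404 status, and the simplistic client discards the body as soon as it sees the error status.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<!DOCTYPE html><html><body><h1>Regular content</h1></body></html>"


class PageWith404Handler(BaseHTTPRequestHandler):
    """Hypothetical handler: serves a normal-looking page under a 404 status."""

    def do_GET(self):
        self.send_response(404)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def naive_crawl(url):
    """Mimic a simplistic crawler: keep the body only if the request succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.read()
    except urllib.error.HTTPError:
        # The 404 is taken at face value and the body is never looked at.
        return None


def demo():
    """Start the server on a free local port and crawl it once."""
    server = HTTPServer(("127.0.0.1", 0), PageWith404Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return naive_crawl("http://127.0.0.1:%d/" % server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Under these assumptions, `demo()` returns `None` even though the server sent the complete page, while a browser pointed at the same URL would render the HTML as usual.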