
How does my browser display a PDF when it didn't say that's something it would accept?


I'm writing a simple HTTP server that will serve content from the file system.

I'm a little confused as to how the client and server negotiate content type.

After doing some research, I found that Content-Type specifies the content type of the HTTP message being sent, while the Accept header specifies what the program expects to receive as a response.

When I visit my server from my browser and read the initial GET request (for the root path, /), I get the following:

GET / HTTP/1.1
Host: 127.0.0.1:1234
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1

As you can see, the Accept header doesn't list the MIME type application/pdf, so it doesn't seem to say that the browser will accept PDFs.

Yet, when I send a PDF's bytes with Content-Type set to application/pdf, the browser displays it anyway.

So, what am I missing? I originally thought the browser might be doing some basic inference on the URI to see if it ends in .pdf, and then accepting the corresponding MIME type.

But, when I visit it with a link to a pdf, the Accept header stays the same.

Any help would be really appreciated.


Solution

  • I'm writing a simple HTTP server

    Then you should learn to find your way around the various RFCs that describe HTTP.

    The relevant one here is RFC 7231, section 5.3.2 (Accept):

    If the header field is present in a request and none of the available representations for the response have a media type that is listed as acceptable, the origin server can either honor the header field by sending a 406 (Not Acceptable) response or disregard the header field by treating the response as if it is not subject to content negotiation.

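    A browser primarily wants to display HTML documents, in whatever variant of (X)HTML the server is willing to serve, so by default it sends the Accept header you observed.

    If the request is for another kind of resource, however, the server is free to respond with that type of content. Note also that the header you observed ends with */*;q=0.8, a wildcard that matches every media type, including application/pdf, just at a lower preference than the HTML types listed before it.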
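
To make the matching rule concrete, here is a minimal sketch of how a server might evaluate an Accept header. The function names (parse_accept, quality, status_for) and the strict flag are hypothetical, made up for illustration; the sketch ignores media type parameters other than q and assumes the header is present and reasonably well formed.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when the parameter is omitted
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def quality(media_type, header):
    """Return the q-value the header gives media_type (0.0 if nothing matches).

    The most specific matching range wins: type/subtype, then type/*, then */*.
    """
    mtype, _, subtype = media_type.lower().partition("/")
    best_specificity, best_q = -1, 0.0
    for media_range, q in parse_accept(header):
        rtype, _, rsub = media_range.partition("/")
        if rtype == "*" and rsub == "*":
            specificity = 0
        elif rtype == mtype and rsub == "*":
            specificity = 1
        elif rtype == mtype and rsub == subtype:
            specificity = 2
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def status_for(media_type, header, strict=False):
    """Pick a status code: a strict server answers 406 when the type is not
    acceptable, a lenient one disregards the header and sends the content."""
    if strict and quality(media_type, header) == 0.0:
        return 406
    return 200


firefox = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(quality("application/pdf", firefox))  # matched by the */* wildcard
print(status_for("application/pdf", "text/html", strict=True))
```

With the header from the question, application/pdf is matched by the */* range, so even a strict server would have no reason to answer 406. The strict flag mirrors the choice the RFC leaves to the server; many simple file servers skip this check entirely and just send the file with the right Content-Type.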