http, http-headers, webserver

HTTP Accept and Content-Type headers confusion


This is an example of an HTTP request message transmitted to a web server. Among its headers is an Accept header. I am confused about what it means and how it is created. I thought it solely specified my browser's capabilities for handling files, but that doesn't explain why it differs when I visit amazon.com or joes-hardware.

There is also the Content-Type header, which gives the MIME type of the requested file. Same question: how does my browser know the type of the file it requested? Is it based on the extension of the URI I requested, or is this a generic header? This header seems to only be sent in response headers. My mistake.

GET /tools.html HTTP/1.0
User-agent: Mozilla/4.75 [en] (Win98; U)
Host: www.joes-hardware.com
Accept: text/html, image/gif, image/jpeg 
Accept-language: en

Solution

  • First things first: Accept and Accept-Language are headers defined in RFC 7231, section 5.3.2 and section 5.3.5, respectively. Together with the other Accept-* headers, they let the client take part in content negotiation. There is an excellent article on content negotiation on the Mozilla Developer Network. (On a side note: the MDN is an excellent starting point for research. A lot of the articles are outdated, but the concepts are still largely valid.)

    The content of the Accept-Language header is largely controlled by the language settings of the client's UI. Firefox (and, IIRC, Opera and Safari) lets you tweak it through its settings, while MSIE seems to deduce it from the keyboard layouts installed on the system. The type of media being requested should have no influence on this header.

    The content of the Accept header, on the other hand, depends very much on the context in which a resource is requested. E.g. if you request a resource through your browser's address bar, the Accept header will pretty much read like "give me anything I can digest." If the browser requests a resource through an <img/>-tag, the header will differ, because the browser is trying to get a representation of the resource that is fit for display inside that tag. The same goes for <video/>, <audio/>, and <script/>.

    Beyond that, I am not aware of any mechanisms affecting the Accept header. <a/>-tags have, unbeknownst to most, a type attribute which carries a MIME media type. This is, however, only a hint and should not alter Accept in any way.

    As for your example, I took the liberty of requesting both sites and copying the relevant request headers:

    amazon.com

    GET / HTTP/1.1
    Host: www.amazon.com
    User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: de,en-US;q=0.7,en;q=0.3
    Accept-Encoding: gzip, deflate
    DNT: 1
    Connection: keep-alive
    Pragma: no-cache
    Cache-Control: no-cache
    

    joes-hardware.com

    GET / HTTP/1.1
    Host: www.joes-hardware.com
    User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: de,en-US;q=0.7,en;q=0.3
    Accept-Encoding: gzip, deflate
    DNT: 1
    Connection: keep-alive
    Pragma: no-cache
    Cache-Control: no-cache
    

    The headers are no different when requesting /tools.html, as in your example.
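
To make the q-values in those Accept headers a bit more tangible, here is a rough sketch of how a server could read such a header and pick a representation. It is not how any particular server does it; the function names (parse_accept, quality, choose) are made up for illustration, and the matching rules are a simplified reading of RFC 7231, section 5.3.2 (media type parameters other than q are ignored).

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # a missing q parameter means q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def matches(media_range, media_type):
    """True if a concrete type such as 'image/png' is covered by the range."""
    range_type, _, range_sub = media_range.partition("/")
    type_, _, sub = media_type.partition("/")
    return range_type in ("*", type_) and range_sub in ("*", sub)


def quality(ranges, media_type):
    """q-value of the most specific range that covers media_type (0.0 if none)."""
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if matches(media_range, media_type):
            # exact type = 2, "type/*" = 1, "*/*" = 0
            specificity = 2 - media_range.count("*")
            if specificity > best_specificity:
                best_specificity, best_q = specificity, q
    return best_q


def choose(header, available):
    """Pick the best type from `available`; earlier entries win ties."""
    ranges = parse_accept(header)
    scored = [(quality(ranges, t), t) for t in available]
    best_q, best = max(scored, key=lambda s: s[0], default=(0.0, None))
    return best if best_q > 0 else None


firefox = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(choose(firefox, ["application/json", "text/html"]))   # text/html
print(choose(firefox, ["image/png", "application/xml"]))    # application/xml
# The header from the question has no wildcard, so a PNG is not acceptable:
print(choose("text/html, image/gif, image/jpeg", ["image/png"]))  # None
```

With the Firefox header above, anything the server has on offer is acceptable thanks to the trailing */*;q=0.8, but HTML is preferred. With the old Mozilla/4.75 header from the question, only the three listed types would match.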