How to encode the filename parameter of Content-Disposition header in HTTP?


Web applications that want to force a resource to be downloaded rather than directly rendered in a Web browser issue a Content-Disposition header in the HTTP response of the form:

Content-Disposition: attachment; filename=FILENAME

The filename parameter can be used to suggest a name for the file into which the resource is downloaded by the browser. RFC 2183 (Content-Disposition), however, states in section 2.3 (The Filename Parameter) that the file name can only use US-ASCII characters:

Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII. We recognize the great desirability of allowing arbitrary character sets in filenames, but it is beyond the scope of this document to define the necessary mechanisms.

Empirically, however, most popular Web browsers today appear to permit non-US-ASCII characters yet, for lack of a standard, disagree on the encoding scheme and character set of the file name. The question, then, is: which schemes and encodings do the popular browsers use when the file name “naïvefile” (without quotes, where the third letter is U+00EF) has to be encoded into the Content-Disposition header?

For the purposes of this question, the popular browsers are:

  • Google Chrome
  • Safari
  • Internet Explorer or Edge
  • Firefox
  • Opera

Solution

  • This is discussed, with links to browser testing and notes on backwards compatibility, in RFC 5987, "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters."

    RFC 2183 indicates that such headers should be encoded according to RFC 2184, which was obsoleted by RFC 2231. RFC 5987 in turn specifies an encoding for HTTP header fields that is compatible with a profile of RFC 2231.
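
As a rough illustration of the extended-parameter form that RFC 5987 describes (filename*=charset'language'percent-encoded-value), here is a minimal Python sketch. The function name content_disposition and the choice to send a plain ASCII filename fallback alongside filename* are assumptions made for this example, not something the RFC text quoted above prescribes; browser behaviour should still be checked against the testing linked from the RFC.

```python
from urllib.parse import quote

# Characters RFC 5987 allows unescaped in an extended value ("attr-char"),
# besides letters and digits.
ATTR_CHARS = "!#$&+-.^_`|~"


def content_disposition(filename: str) -> str:
    """Build a Content-Disposition value (hypothetical helper, for illustration)."""
    # Assumed ASCII fallback: replace anything outside printable US-ASCII
    # with "_" so older clients still receive a usable name.
    fallback = "".join(c if 32 <= ord(c) < 127 else "_" for c in filename)
    # Escape backslashes and double quotes inside the quoted-string.
    fallback = fallback.replace("\\", "\\\\").replace('"', '\\"')
    # Percent-encode the UTF-8 bytes of every character that is not attr-char.
    encoded = quote(filename, safe=ATTR_CHARS, encoding="utf-8")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


if __name__ == "__main__":
    print("Content-Disposition: " + content_disposition("naïvefile"))
```

For the file name from the question, the letter U+00EF becomes the two UTF-8 bytes C3 AF, so the extended parameter reads filename*=UTF-8''na%C3%AFvefile, while the fallback parameter carries na_vefile.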