I have two conflicting sections of code. One produces:
Content-Type: text/html; name=foo_foo2.blah
Content-Disposition: attachment; filename=foo_foo2.blah
Another produces:
Content-Type: text/html; name="foo_foo2.blah"
Content-Disposition: attachment; filename="foo_foo2.blah"
The version without quotes causes unexpected behavior in a receiving application. Are quotes required?
In RFC 2183 I don't see an explicit requirement:
In the extended BNF notation of [RFC 822], the Content-Disposition
header field is defined as follows:
disposition := "Content-Disposition" ":"
               disposition-type
               *(";" disposition-parm)
disposition-type := "inline"
                  / "attachment"
                  / extension-token
                  ; values are not case-sensitive
disposition-parm := filename-parm
                  / creation-date-parm
                  / modification-date-parm
                  / read-date-parm
                  / size-parm
                  / parameter
filename-parm := "filename" "=" value
creation-date-parm := "creation-date" "=" quoted-date-time
modification-date-parm := "modification-date" "=" quoted-date-time
read-date-parm := "read-date" "=" quoted-date-time
size-parm := "size" "=" 1*DIGIT
quoted-date-time := quoted-string
                  ; contents MUST be an RFC 822 `date-time'
                  ; numeric timezones (+HHMM or -HHMM) MUST be used
Perhaps I'm blind though. Can someone please confirm?
Just below the BNF is this passage:
`Extension-token', `parameter', `tspecials' and `value' are defined according to [RFC 2045] (which references [RFC 822] in the definition of some of these tokens). `quoted-string' and `DIGIT' are defined in [RFC 822].
RFC 2045 has this definition in section 5.1 (which, however, describes Content-Type):
value := token / quoted-string
token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
or tspecials>
tspecials := "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / <">
"/" / "[" / "]" / "?" / "="
; Must be in quoted-string,
; to use within parameter values
So a filename which is a token does not need to be quoted; but if it contains any of the tspecials (or control characters or whitespace), you need to quote it after all.
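As a rough illustration of that rule, here is a minimal Python sketch. The helper names `is_token` and `format_param` are made up for this example, and it only deals with plain US-ASCII values; non-ASCII filenames need RFC 2231 encoding, which is not attempted here.

```python
# tspecials as listed in RFC 2045 section 5.1
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(value: str) -> bool:
    """Return True if value is an RFC 2045 token: one or more US-ASCII
    characters, none of them SPACE, a control character, or a tspecial."""
    if not value:
        return False
    for ch in value:
        code = ord(ch)
        # 0-31 are controls, 32 is SPACE, 127 is DEL, above 127 is not US-ASCII
        if code <= 32 or code >= 127 or ch in TSPECIALS:
            return False
    return True


def format_param(name: str, value: str, always_quote: bool = False) -> str:
    """Format name=value, quoting the value only when the grammar requires it
    (hypothetical helper; assumes an ASCII value)."""
    if is_token(value) and not always_quote:
        return f"{name}={value}"
    # Inside a quoted-string, backslash and double quote must be escaped
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{name}="{escaped}"'


print(format_param("filename", "foo_foo2.blah"))   # filename=foo_foo2.blah
print(format_param("filename", "foo foo2.blah"))   # filename="foo foo2.blah"
```

Passing `always_quote=True` gives the quoted form even for values that are valid tokens.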
Just to specifically address the case of underscore, it is not a character which requires quoting according to the RFC (it's not a control character, whitespace, or enumerated as one of the tspecials), but the way things are in the wild, you are probably better off quoting everything just in case. (Shall we call this anti-Postel? Be unduly conservative about what you transmit, and don't be too liberal in what you think you can infer about obviously invalid input.)
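For what it's worth, Python's standard `email` package appears to take the same "quote everything" approach when generating headers, while accepting both forms when parsing. A small sketch, using the filename from the question (the message bodies are just placeholders):

```python
from email import message_from_string
from email.message import Message

# Generating: add_header() quotes the parameter value even though
# foo_foo2.blah is a valid token and would not strictly need it.
msg = Message()
msg.add_header("Content-Disposition", "attachment", filename="foo_foo2.blah")
generated = msg["Content-Disposition"]
print(generated)  # attachment; filename="foo_foo2.blah"

# Parsing: both the unquoted and the quoted form yield the same filename.
unquoted = message_from_string(
    "Content-Disposition: attachment; filename=foo_foo2.blah\n\nbody\n"
)
quoted = message_from_string(
    'Content-Disposition: attachment; filename="foo_foo2.blah"\n\nbody\n'
)
names = (unquoted.get_filename(), quoted.get_filename())
print(names)  # ('foo_foo2.blah', 'foo_foo2.blah')
```

A receiver that only handles the quoted form is stricter than the grammar requires, but always emitting quotes sidesteps the problem.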
As a bit of an aside, MIME filenames in email are in practice completely the Wild West; a lot of popular applications simply ignore RFC2231 and use RFC2047 encoding here instead, or no encoding, or completely their own ad hoc "I thought this might work and nobody has complained" concoctions.