How to fold an email header with long email addresses according to RFC 5322?


Assume that an email has the following header field:

To: =?utf-8?q?Foo_Bar?= <1234567890123456789012345678901234567890123456789012345678901234@abcdefghiabcdefghiabcdefghiabcdefghiabcdefghiabcdefghiabcdefghi.com>

Is there a way to fold the header

  1. in full accordance with RFC 5322,
  2. such that the email is still accepted by commonly used MTAs, and
  3. such that no line exceeds a length of 78 characters?

I am aware that the hard limit on line length is 998 chars, but I wonder if it is possible to also fulfill all SHOULD-requirements. If I understand the Augmented Backus-Naur Form

domain-literal  =   [CFWS] "[" *([FWS] dtext) [FWS] "]" [CFWS]

dtext           =   %d33-90 /          ; Printable US-ASCII
                    %d94-126 /         ;  characters not including
                    obs-dtext          ;  "[", "]", or "\"

in section 3.4.1 correctly, one can insert folding whitespace into domain literals, and the following should be valid:

To: =?utf-8?q?Foo_Bar?=
 <1234567890123456789012345678901234567890123456789012345678901234@abcdefgh
 iabcdefghiabcdefghiabcdefghiabcdefghiabcdefghiabcdefghi.com>

However, this is rejected by recent versions of Postfix and Exim:

501:  <1234567890123456789012345678901234567890123456789012345678901234@abcdefgh: '>' missing at end of address

Either both MTAs are broken (which seems highly unlikely), or my interpretation of the RFC is wrong.

Addendum in case someone runs into similar problems:

Before posting the question, I actually tried to fold at @ and . as shown in the RFC-conformant example by jstedfast, but got the same error message. As it turns out, this was not the fault of the MTA but of the SMTP client library I used, which extracted the recipient addresses from the header for generating RCPT TO: commands for SMTP and failed to filter out the line breaks.
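
For anyone hitting the same client-side bug, the missing step can be sketched roughly as follows. This is a minimal sketch, assuming a header with exactly one angle-addr and no comments or quoted strings; `unfold` and `rcpt_from_header` are hypothetical helper names, not functions of any SMTP library.

```python
import re


def unfold(header: str) -> str:
    """Undo RFC 5322 folding: drop each line break that is followed by WSP."""
    return re.sub(r"\r?\n(?=[ \t])", "", header)


def rcpt_from_header(header: str) -> str:
    """Return the bare address from a header holding one <angle-addr>.

    Hypothetical helper: ignores comments, quoted strings and address lists.
    """
    match = re.search(r"<([^<>]*)>", unfold(header))
    if match is None:
        raise ValueError("no angle-addr found")
    # Whitespace left between the tokens is optional CFWS, not address data.
    return re.sub(r"[ \t]+", "", match.group(1))


# The header folded at "<", "@" and ">", with CRLF line endings.
folded = "\r\n".join([
    "To: =?utf-8?q?Foo_Bar?=",
    " <",
    " " + "1234567890" * 6 + "1234",
    " @",
    " " + "abcdefghi" * 7 + ".com",
    " >",
])

address = rcpt_from_header(folded)
print(f"RCPT TO:<{address}>")
```

Note that stripping all whitespace is deliberately lenient: it would also accept a fold in the middle of the domain, which the header grammar itself does not allow.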


Solution

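  • You are not allowed to fold in the middle of the domain (which is what you did), only before or after it. Your domain is a dot-atom, not a domain-literal (it has no square brackets), so the [FWS] inside the domain-literal rule does not apply to it.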

    RFC 5322 says folding white space SHOULD NOT be used around the @, but SHOULD NOT is not MUST NOT.

    angle-addr      =   [CFWS] "<" addr-spec ">" [CFWS] /
                        obs-angle-addr
    
    addr-spec       =   local-part "@" domain
    
    local-part      =   dot-atom / quoted-string / obs-local-part
    
    domain          =   dot-atom / domain-literal / obs-domain
    
    domain-literal  =   [CFWS] "[" *([FWS] dtext) [FWS] "]" [CFWS]
    
    dtext           =   %d33-90 /          ; Printable US-ASCII
                        %d94-126 /         ;  characters not including
                        obs-dtext          ;  "[", "]", or "\"
    
    atext           =   ALPHA / DIGIT /    ; Printable US-ASCII
                        "!" / "#" /        ;  characters not including
                        "$" / "%" /        ;  specials.  Used for atoms.
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"
    
    atom            =   [CFWS] 1*atext [CFWS]
    
    dot-atom-text   =   1*atext *("." 1*atext)
    
    dot-atom        =   [CFWS] dot-atom-text [CFWS]
    
    specials        =   "(" / ")" /        ; Special characters that do
                        "<" / ">" /        ;  not appear in atext
                        "[" / "]" /
                        ":" / ";" /
                        "@" / "\" /
                        "," / "." /
                        DQUOTE
    

    So, if we expand the definitions and apply them to your example, we get the following (each token on its own line to avoid the need to scroll horizontally):

    [CFWS]
    "<"
    [CFWS] 
    "1234567890123456789012345678901234567890123456789012345678901234"
    [CFWS]
    "@"
    [CFWS] 
    "abcdefghiabcdefghiabcdefghiabcdefghiabcdefghiabcdefghiabcdefghi"
    "."
    "com"
    [CFWS]
    ">"
    [CFWS]
    

    Each [CFWS] marks a position where the spec technically allows a fold, that is, a line break followed by at least one space or tab.

    So an example way to break your To header would be this:

    To: =?utf-8?q?Foo_Bar?=
     <
     1234567890123456789012345678901234567890123456789012345678901234
     @
     abcdefghiabcdefghiabcdefghiabcdefghiabcdefghiabcdefghiabcdefghi.com
     >
    

    Any RFC-compliant address parser will need to handle that.

    (Self-promotion here, but MimeKit's address parser handles this ;-)
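
The folding rule above can also be applied programmatically. The following is a minimal sketch, assuming a single mailbox whose display name, local part and domain are already known and contain no comments or quoted strings; `fold_mailbox` and `LIMIT` are hypothetical names chosen for illustration. It packs tokens greedily and folds only at the [CFWS] positions between them, never inside the local part or the domain.

```python
LIMIT = 78  # recommended maximum line length in RFC 5322, not counting CRLF


def fold_mailbox(field: str, display: str, local: str, domain: str) -> str:
    """Fold 'field: display <local@domain>' only between tokens."""
    # Each entry: (token, separator used if the token stays on the current line)
    pieces = [
        (display, " "),
        ("<", " "),
        (local, ""),
        ("@", ""),
        (domain, ""),
        (">", ""),
    ]
    lines = [f"{field}:"]
    for token, sep in pieces:
        if len(lines[-1]) + len(sep) + len(token) <= LIMIT:
            lines[-1] += sep + token
        else:
            # Fold here: a new line that starts with one space.
            lines.append(" " + token)
    return "\r\n".join(lines)


folded = fold_mailbox(
    "To",
    "=?utf-8?q?Foo_Bar?=",
    "1234567890" * 6 + "1234",
    "abcdefghi" * 7 + ".com",
)
for line in folded.split("\r\n"):
    print(len(line), line)
```

For the address in the question this yields three lines, each within the 78-character limit, with the folds placed after "<" and after "@". A single token longer than 77 characters cannot be brought under the limit by this sketch, because the grammar offers no fold position inside a dot-atom.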