delphi c++builder indy email-address

Indy10 - Encoded words in email address


I have encountered some emails with encoded words in the email address, e.g. instead of

abc <abc@example.com>

it contains:

abc <=?ISO8859-1?B?YWJjQGV4YW1wbGUuY29t=?=>

Many email programs have trouble with this, but a few handle it, which makes me think it might be covered by some RFC, although I cannot find one.

Additionally, if the From header has the form:

From: =?ISO8859-1?B?YWJjQGV4YW1wbGUuY29t=?=

many programs will decode it, but Indy will not. Most programs treat that value as the From "name" part and decode it as such, but leave the email address encoded, resulting in:

Name = abc@example.com Email = =?ISO8859-1?B?YWJjQGV4YW1wbGUuY29t=?=

So the header is at least partially decoded.

In Indy, however, this results in:

Name = **blank** Email = =?ISO8859-1?B?YWJjQGV4YW1wbGUuY29t=?=

Should Indy support this (or reverse its behavior so that the base64 part is treated as the "name" part rather than the "email" part), or is this an incorrectly formatted email address? Or is it a matter of interpretation which part comes first, given that the header can legitimately look like From: abc@example.com without the <> characters?


Solution

  • abc <=?ISO8859-1?B?YWJjQGV4YW1wbGUuY29t=?=>

    Indy does not support such encoded addresses, per RFC 2047 Section 5, "Use of encoded-words in message headers":

    An 'encoded-word' may appear in a message header or body part header according to the following rules:

    (1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822) in any Subject or Comments header field, any extension message header field, or any MIME body part field for which the field body is defined as '*text'. An 'encoded-word' may also appear in any user-defined ("X-") message or body part header field.

    Ordinary ASCII text and 'encoded-word's may appear together in the same header field. However, an 'encoded-word' that appears in a header field defined as '*text' MUST be separated from any adjacent 'encoded-word' or 'text' by 'linear-white-space'.

    (2) An 'encoded-word' may appear within a 'comment' delimited by "(" and ")", i.e., wherever a 'ctext' is allowed. More precisely, the RFC 822 ABNF definition for 'comment' is amended as follows:

    comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")"

    A "Q"-encoded 'encoded-word' which appears in a 'comment' MUST NOT contain the characters "(", ")" or the double-quote character. An 'encoded-word' that appears in a 'comment' MUST be separated from any adjacent 'encoded-word' or 'ctext' by 'linear-white-space'.

    It is important to note that 'comment's are only recognized inside "structured" field bodies. In fields whose bodies are defined as '*text', "(" and ")" are treated as ordinary characters rather than comment delimiters, and rule (1) of this section applies. (See RFC 822, sections 3.1.2 and 3.1.3)

    (3) As a replacement for a 'word' entity within a 'phrase', for example, one that precedes an address in a From, To, or Cc header. The ABNF definition for 'phrase' from RFC 822 thus becomes:

    phrase = 1*( encoded-word / word )

    In this case the set of characters that may be used in a "Q"-encoded 'encoded-word' is restricted to: <upper and lower case ASCII letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_" (underscore, ASCII 95.)>. An 'encoded-word' that appears within a 'phrase' MUST be separated from any adjacent 'word', 'text' or 'special' by 'linear-white-space'.

    These are the ONLY locations where an 'encoded-word' may appear. In particular:

    • An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'.

    • An 'encoded-word' MUST NOT appear within a 'quoted-string'.

    • An 'encoded-word' MUST NOT be used in a Received header field.

    • An 'encoded-word' MUST NOT be used in parameter of a MIME Content-Type or Content-Disposition field, or in any structured field body except within a 'comment' or 'phrase'.
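The first prohibition above (no 'encoded-word' in any portion of an 'addr-spec') can be checked mechanically. The following is only a rough sketch in standard C++, not Indy's parser and not a full RFC 5322 address parser: the function names `extractAddrSpec`, `containsEncodedWord` and `addrSpecIsEncoded` are made up for this illustration, and the code assumes the address, if bracketed, sits between the last `<` and the following `>`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return the text between the last '<' and the next '>',
// or the whole value when there are no angle brackets.
// Simplification: ignores comments, groups and quoted strings.
std::string extractAddrSpec(const std::string& value) {
    const std::string::size_type open = value.rfind('<');
    if (open == std::string::npos) return value;
    const std::string::size_type close = value.find('>', open);
    if (close == std::string::npos) return value;
    return value.substr(open + 1, close - open - 1);
}

// True when the text contains something shaped like =?charset?B-or-Q?payload?=
bool containsEncodedWord(const std::string& text) {
    static const std::regex pattern(R"(=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=)");
    return std::regex_search(text, pattern);
}

// True when the address part of a mailbox violates the RFC 2047 rule
// that an encoded-word must not appear in an addr-spec.
bool addrSpecIsEncoded(const std::string& headerValue) {
    return containsEncodedWord(extractAddrSpec(headerValue));
}
```

With this sketch, the mailbox from the question, whose address part is an encoded word, is flagged, while a mailbox that only encodes the display name is not.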

    Indy (and RFC 2047) does support encoded names, though:

    From: =?ISO8859-1?B?YWJj?= <abc@example.com>
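
To make the encoded name concrete, here is a minimal sketch of decoding a single "B"-encoded word by hand in standard C++. It is not how Indy does it; the names `decodeBase64` and `decodeEncodedWordB` are invented for this example, "Q" encoding is not handled, and the result is returned as raw bytes without any charset conversion (an assumption that is harmless for pure ASCII payloads).

```cpp
#include <string>

// Decode standard base64 into raw bytes. Decoding stops at the first '='
// padding character; returns false on a character outside the alphabet.
bool decodeBase64(const std::string& in, std::string& out) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    out.clear();
    unsigned int buffer = 0;
    int bits = 0;
    for (char c : in) {
        if (c == '=') break;
        const std::string::size_type pos = alphabet.find(c);
        if (pos == std::string::npos) return false;
        buffer = (buffer << 6) | static_cast<unsigned int>(pos);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFFu));
        }
    }
    return true;
}

// Split "=?charset?B?payload?=" into its charset label and decoded bytes.
// Returns false when the token is not a B-encoded word.
bool decodeEncodedWordB(const std::string& token,
                        std::string& charset, std::string& bytes) {
    if (token.size() < 6) return false;
    if (token.compare(0, 2, "=?") != 0) return false;
    if (token.compare(token.size() - 2, 2, "?=") != 0) return false;
    const std::string inner = token.substr(2, token.size() - 4);
    const std::string::size_type q1 = inner.find('?');
    if (q1 == std::string::npos || q1 == 0) return false;
    const std::string::size_type q2 = inner.find('?', q1 + 1);
    if (q2 != q1 + 2) return false;  // encoding must be one character
    const char encoding = inner[q1 + 1];
    if (encoding != 'B' && encoding != 'b') return false;
    charset = inner.substr(0, q1);
    return decodeBase64(inner.substr(q2 + 1), bytes);
}
```

Applied to the name token `=?ISO8859-1?B?YWJj?=`, this yields the charset label `ISO8859-1` and the bytes `abc`.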
    

  • From: =?ISO8859-1?B?YWJjQGV4YW1wbGUuY29t=?=

    In this case, Indy interprets the value as an email address without a name, and, as above, encoded addresses are not allowed.
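
The reason a bare token ends up in the address slot can be illustrated with a simplified name/address split. This is a sketch of the general idea only, under the assumption that a value without angle brackets is treated entirely as the address; the `Mailbox` struct and `splitMailbox` function are hypothetical and do not reproduce Indy's actual code.

```cpp
#include <string>

// Hypothetical result type for this illustration.
struct Mailbox {
    std::string name;     // display name ('phrase'), possibly empty
    std::string address;  // 'addr-spec' as found, not decoded
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* whitespace = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) return std::string();
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}

// Split "name <address>" into its parts. Without angle brackets the whole
// value is taken as the address and the name stays empty.
Mailbox splitMailbox(const std::string& value) {
    Mailbox result;
    const std::string::size_type open = value.rfind('<');
    const std::string::size_type close =
        (open == std::string::npos) ? std::string::npos : value.find('>', open);
    if (open != std::string::npos && close != std::string::npos) {
        result.name = trim(value.substr(0, open));
        result.address = trim(value.substr(open + 1, close - open - 1));
    } else {
        result.address = trim(value);
    }
    return result;
}
```

Under that assumption, `=?ISO8859-1?B?YWJjQGV4YW1wbGUuY29t=?=` comes back with an empty name and the still-encoded token as the address, which matches the blank name reported in the question. Since an 'addr-spec' may not contain an encoded word, the token is left undecoded.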