amazon-s3, amazon-ses, mime-message

How to properly parse MimeMessage?


We are using Amazon Simple Email Service (SES) to receive emails, and it stores them on S3. The stored messages are in MIME format, version 1.0. In some cases the "Bcc", "Cc", and "To" headers are empty in the MIME message.

Is it safe to parse the Received header and take its "for" value?

Received: from [email protected] (mout.perfora.net [74.208.4.197])
 by inbound-smtp.eu-west-1.amazonaws.com with SMTP id i9345l5jt652cm6mnupeorc57rfmf7p6me31ca01
 for **[email protected]**;
 Mon, 14 Nov 2022 18:41:44 +0000 (UTC)

Solution

  • As far as I can imagine, the only situation where this happens is when the sender uses some unusual email service provider (one that does not follow the usual standard way) and your email address is in the BCC of the email.

    If you want to know exactly how the mail was sent to you, I would also suggest a different solution. The In-Reply-To header is useful here: most common email services set it on replies. If you include an identifier in the email you send, it is passed back in the reply, so you get the exact ID of the message, or just an email address if that is enough.

    On the other hand, if you would like to handle the case where someone builds the email entirely by themselves (and you are 100% sure that there are no "To", "Cc", or "Bcc" fields), the first thing I would do is thoroughly analyse the headers that are there. Other headers may have been added, such as Delivered-To, which will certainly be easier to analyse. If those are unavailable, you have no choice but to parse the Received header. I believe those cases are rare, though: handling them requires a separate regex, which is possible but somewhat overengineered.

    My main advice:

    1. Focus on finding other headers.
    2. Analyse all of the headers in the email first and use those, so you don't have to worry about custom regexes.
    3. If nothing else works, you have no other choice :)
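
The order of checks described above could be sketched in Python with the standard `email` package. This is only an illustration under a few assumptions: the function name `find_recipients`, the `X-Original-To` fallback header, and the regex for the "for" clause are not from the question or the answer, and a given mail server may not add any of these headers at all.

```python
import re
from email import message_from_bytes
from email.utils import getaddresses

# Delivered-To is the header named in the answer; X-Original-To is an
# assumption (some mail servers add it, many do not).
FALLBACK_HEADERS = ("Delivered-To", "X-Original-To")

# Assumed pattern for the optional "for <address>" clause of a Received header.
RECEIVED_FOR = re.compile(r"\bfor\s+<?([^\s<>;]+@[^\s<>;]+)>?", re.IGNORECASE)


def find_recipients(raw: bytes) -> list[str]:
    """Return recipient addresses, trying the easiest headers first."""
    msg = message_from_bytes(raw)

    def addresses(*names):
        values = [str(v) for name in names for v in msg.get_all(name, [])]
        return [addr for _, addr in getaddresses(values) if addr]

    # 1. Standard addressing headers, then 2. other delivery headers.
    found = addresses("To", "Cc", "Bcc") or addresses(*FALLBACK_HEADERS)
    if found:
        return found

    # 3. Last resort: the "for" clause of the Received headers. The topmost
    #    header is the most recent hop, so the first match wins.
    for received in msg.get_all("Received", []):
        match = RECEIVED_FOR.search(str(received))
        if match:
            return [match.group(1)]
    return []
```

The raw bytes would be the object body fetched from S3. Note that the "for" clause is optional in a Received header, so the function can legitimately return an empty list; the caller should treat that as "recipient unknown" rather than as an error.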