
JavaMail doesn't properly recognize nested messages when parsing an .eml


I'm implementing an .eml parser using JavaMail 1.5.6. I started by copying from msgshow.java, a sample provided with JavaMail.

I'm testing with an .eml that contains another .eml as an attachment. This is an extract:

MIME-Version: 1.0
Date: Tue, 30 Apr 2019 16:20:45 +0200
Message-ID: <CA+fLqEW8TUfSxih9DTp2WXa63pS7wf1eZiro_9k1XS4AShN5Zg@mail.gmail.com>
Subject: Message with an eml as attachment
From: a b <[email protected]>
To: [email protected]
Content-Type: multipart/mixed; boundary="00000000000057f76c0587c01bc9"

--00000000000057f76c0587c01bc9
Content-Type: multipart/alternative; boundary="00000000000057f7670587c01bc7"

--00000000000057f7670587c01bc7
Content-Type: text/plain; charset="UTF-8"

Hello guys,

this is a simple message from a not certified account, it contains only one
attachment, an eml message

--00000000000057f7670587c01bc7
Content-Type: text/html; charset="UTF-8"

<div dir="ltr">Hello guys,<div><br></div><div>this is a simple message from a not certified account, it contains only one attachment, an eml message</div></div>
--00000000000057f76c0587c01bc9
Content-Type: message/rfc822; name="Cena zerebao.eml"
Content-Disposition: attachment; filename="Cena zerebao.eml"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_jv3vpu760
Content-ID: <f_jv3vpu760>

WC1Ob3Rlcy1JdGVtOiBGcmksIDYgSnVsIDIwMTggMTc6NDA6MDAgKzAyMDA7DQogdHlwZT00MDA7
IG5hbWU9T3JpZ2luYWxNb2RUaW1lDQpYLU5vdGVzLUl0ZW06IE1lbW87DQogbmFtZT1Gb3JtDQpY
LU5vdGVzLUl0ZW06IFN0ZE5vdGVzTHRyMjU7DQo.... and so on

JavaMail recognizes the attached .eml as a message, but when I read its subject, date, body, attachments and so on, they are all null. msgshow.java itself doesn't see them either.

Before JavaMail I implemented my parser with mime4j and didn't have this problem, but now I would like to parse .eml files using only JavaMail if possible.


Solution

  • Set the mail.mime.allowencodedmessages System property to "true". From the Javadoc describing that property:

    The MIME spec does not allow body parts of type message/* to be encoded. The Content-Transfer-Encoding header is ignored in this case. Some versions of Microsoft Outlook will incorrectly encode message attachments. Setting this System property to "true" will cause the Content-Transfer-Encoding header to be honored for message attachments. The default value of this property is false.
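
    As a rough sketch of what this means for the message in the question: the attached part is message/rfc822 with Content-Transfer-Encoding: base64, so with the default setting JavaMail presumably tries to read the base64 text itself as the nested message and finds no usable headers. The snippet below uses only the JDK. It decodes the first base64 line from the extract to show that a normal header is hiding in there, and sets the property. The class and method names (EncodedMessageDemo, decode, enableEncodedMessages) are made up for illustration. Setting the property before any JavaMail class is used is the cautious choice; I have not checked exactly when JavaMail reads it.

    ```java
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    // Hypothetical helper class, not part of JavaMail.
    final class EncodedMessageDemo {
        // Property name taken from the Javadoc quoted above.
        static final String ALLOW_ENCODED = "mail.mime.allowencodedmessages";

        // First base64 line of the attached .eml, copied from the extract in the question.
        static final String FIRST_LINE =
            "WC1Ob3Rlcy1JdGVtOiBGcmksIDYgSnVsIDIwMTggMTc6NDA6MDAgKzAyMDA7DQogdHlwZT00MDA7";

        // Decodes MIME-style base64 text into an ASCII string.
        static String decode(String base64) {
            byte[] raw = Base64.getMimeDecoder().decode(base64);
            return new String(raw, StandardCharsets.US_ASCII);
        }

        // Asks JavaMail to honor Content-Transfer-Encoding on message/* parts.
        static void enableEncodedMessages() {
            System.setProperty(ALLOW_ENCODED, "true");
        }

        public static void main(String[] args) {
            // Assumption: this runs before the first JavaMail class is touched.
            enableEncodedMessages();

            // The decoded text starts with an ordinary header of the nested message.
            System.out.println(decode(FIRST_LINE));

            // From here on, parse the .eml with JavaMail as before.
        }
    }
    ```

    The same property can also be passed on the command line as -Dmail.mime.allowencodedmessages=true, which avoids any ordering concerns inside the program.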