Parse MIME Body using Java Mail MimeMessage


How do I parse a MIME body part using JavaMail's MimeMessage? I am fetching only the BODY part from my IMAP server.

This is the command I use to fetch only the BODY part:

A001 UID FETCH 1 (UID FLAGS BODY.PEEK[1])

This fetches the raw MIME content without the message headers and without attachments (but it does contain inline images).

When I parse the fetched content using JavaMail's MimeMessage, I get the wrong result.

For example, if the raw MIME contains inline images, part.getInputStream() returns content that still includes the inline image data.

Raw MIME:

------=_Part_385483_1716430164.1405422119116
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Test mail

------=_Part_385483_1716430164.1405422119116
    Content-Type: multipart/related; 
    boundary="----=_Part_385484_590068567.1405422119140"

------=_Part_385484_590068567.1405422119140
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: 7bit

 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><meta content="text/html;charset=UTF-8" http-equiv="Content-Type"></head><body ><div style='font-size:10pt;font-family:Verdana,Arial,Helvetica,sans-serif;'>Test mail<br><img src="cid:inline_img" style="height: 1200px; width: 1600px;"></body></html>
------=_Part_385484_590068567.1405422119140
Content-Type: image/jpeg; name=1405422097638.jpeg
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename=1405422097638.jpeg
Content-ID: <inline_img>

/9j/4AAQSkZJRgABAgAAZABkAAD/7AARRHVja3kAAQAEAAAAPAAA/+4ADkFkb2JlAGTAAAAAAf/b
AIQABgQEBAUEBgUFBgkGBQYJCwgGBggLDAoKCwoKDBAMDAwMDAwQDA4PEA8ODBMTFBQTExwbGxsc
Hx8fHx8fHx8fHwEHBwcNDA0YEBAYGhURFRofHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8f
Hx8fHx8fHx8fHx8fHx8fHx8f/8AAEQgEsAZAAwERAAIRAQMRAf/EAMUAAQADAQEBAQEBAAAAAAAA
.....
------=_Part_385484_590068567.1405422119140--
------=_Part_385483_1716430164.1405422119116--

Result:

Test mail

------=_Part_385483_1716430164.1405422119116
    Content-Type: multipart/related; 
    boundary="----=_Part_385484_590068567.1405422119140"

------=_Part_385484_590068567.1405422119140
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: 7bit

 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><meta content="text/html;charset=UTF-8" http-equiv="Content-Type"></head><body ><div style='font-size:10pt;font-family:Verdana,Arial,Helvetica,sans-serif;'>Test mail<br><img src="cid:inline_img" style="height: 1200px; width: 1600px;"></body></html>
------=_Part_385484_590068567.1405422119140
Content-Type: image/jpeg; name=1405422097638.jpeg
Content-Transfer-Encoding: base64
Content-Disposition: inline; filename=1405422097638.jpeg
Content-ID: <inline_img>

/9j/4AAQSkZJRgABAgAAZABkAAD/7AARRHVja3kAAQAEAAAAPAAA/+4ADkFkb2JlAGTAAAAAAf/b
AIQABgQEBAUEBgUFBgkGBQYJCwgGBggLDAoKCwoKDBAMDAwMDAwQDA4PEA8ODBMTFBQTExwbGxsc
Hx8fHx8fHx8fHwEHBwcNDA0YEBAYGhURFRofHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8fHx8f
Hx8fHx8fHx8fHx8fHx8fHx8f/8AAEQgEsAZAAwERAAIRAQMRAf/EAMUAAQADAQEBAQEBAAAAAAAA
.....
------=_Part_385484_590068567.1405422119140--
------=_Part_385483_1716430164.1405422119116--

Can anyone suggest how to parse this body using JavaMail's MimeMessage?

Thanks.


Solution

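  • You are getting just the .TEXT portion (i.e. the content of the part, without its MIME headers). Without those headers the parser never sees the Content-Type that declares the multipart boundary, so you need to combine the .MIME and the .TEXT portions before you can parse the result.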

    You can see how I do this in my own IMAP library (written in C#, though) in the GetBodyPart method of ImapFolder.cs:

    https://github.com/jstedfast/MailKit/blob/master/MailKit/Net/Imap/ImapFolder.cs#L3729

    Effectively, I request <part-spec>.MIME and <part-spec>.TEXT and then chain them together in a custom ChainedStream class that takes a list of streams and reads from them as if they were a single, sequential stream.
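
The chaining idea described above can be sketched in Java with the standard java.io.SequenceInputStream, which plays roughly the role of the ChainedStream class mentioned in the answer. This is only a minimal sketch under stated assumptions: the class name ChainedMimeStream, its helper methods, and the header and body strings are hypothetical stand-ins for the two literals an IMAP server would return (for a whole message, these would typically come from separate BODY.PEEK[HEADER] and BODY.PEEK[TEXT] fetches). The sketch deliberately uses only the JDK; with JavaMail on the classpath, the combined stream could presumably be handed to the MimeMessage(Session, InputStream) constructor instead of being printed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ChainedMimeStream {

    // Present the header block and the content as one sequential stream,
    // so a MIME parser reads the headers first and the body right after.
    static InputStream chain(InputStream mimeHeaders, InputStream content) {
        return new SequenceInputStream(mimeHeaders, content);
    }

    // Wrap an ASCII string as a stream (stand-in for a FETCH response literal).
    static InputStream ascii(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.US_ASCII));
    }

    // Drain a stream into a string; used here only to show the joined result.
    static String readAll(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical data: the header block and the body, fetched separately.
        // The header block must end with an empty line (CRLF CRLF).
        String headers = "Content-Type: multipart/alternative; boundary=\"b1\"\r\n\r\n";
        String body = "--b1\r\nContent-Type: text/plain\r\n\r\nTest mail\r\n--b1--\r\n";

        InputStream whole = chain(ascii(headers), ascii(body));
        // With JavaMail available, `whole` could be passed to
        // new MimeMessage(session, whole) rather than printed.
        System.out.print(readAll(whole));
    }
}
```

The point of chaining rather than concatenating byte arrays is that neither response has to be copied into a new buffer, which matters when the body carries large base64-encoded inline images like the one in the question.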