Tags: jakarta-mail, inline-images

Reading an embedded inline image from an HTML email


I am reading email over an IMAP connection and I get the messages as HTML content. When I receive an email whose body contains an image, I fail to get the image from the email body.

The HTML output is as follows.

<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
    {font-family:Cambria;
    panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
    {font-family:Calibri;
    panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
    {font-family:Tahoma;
    panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0in;
    margin-bottom:.0001pt;
    font-size:11.0pt;
    font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
    {mso-style-priority:99;
    color:blue;
    text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
    {mso-style-priority:99;
    color:purple;
    text-decoration:underline;}
p.MsoAcetate, li.MsoAcetate, div.MsoAcetate
    {mso-style-priority:99;
    mso-style-link:"Balloon Text Char";
    margin:0in;
    margin-bottom:.0001pt;
    font-size:8.0pt;
    font-family:"Tahoma","sans-serif";}
span.EmailStyle17
    {mso-style-type:personal-compose;
    font-family:"Calibri","sans-serif";
    color:windowtext;}
span.BalloonTextChar
    {mso-style-name:"Balloon Text Char";
    mso-style-priority:99;
    mso-style-link:"Balloon Text";
    font-family:"Tahoma","sans-serif";}
.MsoChpDefault
    {mso-style-type:export-only;
    font-family:"Calibri","sans-serif";}
@page WordSection1
    {size:8.5in 11.0in;
    margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
    {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">The body parts <o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal"><i><span lang="EN-GB" style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#365F91">Regards</span></i><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#365F91">,</span></i><span lang="EN-IN" style="color:#365F91"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-GB" style="color:#365F91">&nbsp;</span><span lang="EN-IN" style="color:#365F91"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Amith K&nbsp; Bharathan</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Software Engineer</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">&nbsp;</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>

<p class="MsoNormal">
<span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">


**<img width="110" height="61" id="Picture_x0020_1" src="cid:[email protected]" alt="Description: Description: Description: Description: tstlogo">**


</span><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">TTT Software &amp; Systems India Pvt. Ltd.</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D"> Infopark, Kakkanad-682 030</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Mob&nbsp;&nbsp; :</span></i></b><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D"> &#43;91 99957</span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><b><i><span style="font-size:10.0pt;font-family:&quot;Cambria&quot;,&quot;serif&quot;;color:#17365D">Email :</span></i></b><i><span style="font-size:10.0pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:#17365D">
<a href="mailto:[email protected]"><span style="color:#17365D">[email protected]</span></a></span></i><span lang="EN-IN" style="color:#17365D"><o:p></o:p></span></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
</div>
</body>
</html>

My code snippet is:

private String getTextFromMimeMultipart(MimeMultipart mimeMultipart) throws Exception {
    String result = "";
    int count = mimeMultipart.getCount();
    System.out.println("____________START______GET MULTI PART" + count);
    for (int i = 0; i < count; i++) {
        BodyPart bodyPart = mimeMultipart.getBodyPart(i);
        if (bodyPart.isMimeType("text/plain")) {
            System.out.println("11111111111111");
            result = result + "\n" + bodyPart.getContent();
            //System.out.println("RESULT "+result);
            // break; // without break same text appears twice in my tests
        }
        if (bodyPart.isMimeType("text/html")) {
            System.out.println("2222222");
            String html = (String) bodyPart.getContent();
            System.out.println("22 bodypart " + bodyPart.getContentType());

            result = result + "\n >>> " + org.jsoup.Jsoup.parse(html).text();
        }
        if (bodyPart.getContent() instanceof MimeMultipart) {
            result = result + getTextFromMimeMultipart((MimeMultipart) bodyPart.getContent());
            System.out.println("3333333333333" + (MimeMultipart) bodyPart.getContent());

            Multipart multiPart = (MimeMultipart) bodyPart.getContent();
            System.out.println("multipart COUNT " + multiPart.getCount());
            for (int v = 0; v < multiPart.getCount(); v++) {
                MimeBodyPart part = (MimeBodyPart) multiPart.getBodyPart(v);
                System.out.println("PART ENCODING 111" + part.getEncoding() + "DISPOSITION " + part.getDisposition());
                // downloadFile("fl"+v, part) ;
                if (Part.ATTACHMENT.equalsIgnoreCase(part.getDisposition())) {
                    downloadFile("fl" + v, part);
                }
                if (Part.INLINE.equalsIgnoreCase(part.getDisposition())) {
                    System.out.println("_________________INLINE___________");
                }
                if (part.getDisposition() == null) {
                    System.out.println("INLINE FILE NAME " + part.getFileName());

                    //downloadFile("fl"+v, part) ;
                }
            }
        }
    }
    System.out.println("____________END___________" + result);
    return result;
}

Output:

____________START______GET MULTI PART2

2222222

22 bodypart text/html; charset=us-ascii ____________END___________

  The body parts   Regards,   Amith K  Bharathan Software Engineer   TST Software & Systems India Pvt. Ltd. Infopark, Kakkanad-682 030 Mob   : +91 99947 Email : [email protected]     MULTI PART 2

INLINE IMAGE FILLLLL NAMEEEE null I/P com.sun.mail.imap.IMAPInputStream@75b3adecMIME


Solution

  • When you get a multipart message, you go through its parts and handle text/plain parts, text/html parts, and multipart parts. If one of the parts of the top-level message is a multipart, you look for images in that sub-multipart. But you never look for images in the top-level multipart itself. Add an "else" clause to the top-level "if" statement and you'll see what you're missing.

    You've made some assumptions about the structure of a MIME message that aren't true in general.
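
To make the point concrete, here is a minimal sketch of a traversal that applies the same checks to every part at every level, so an image sitting directly in the top-level multipart is found just like a nested one. It is an illustration under assumptions, not the asker's code: `Node` is a hypothetical stand-in for a Jakarta Mail `Part` so that the sketch runs without the library, and the Content-ID values are invented. With the real API the same shape would presumably use `part.isMimeType("multipart/*")`, `Multipart.getBodyPart(i)` and `MimePart.getContentID()`. The map is keyed by Content-ID without the angle brackets, because that is the form the HTML uses in `src="cid:..."`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InlineImageWalk {

    // Hypothetical stand-in for a Jakarta Mail Part, so the sketch runs without the library.
    // A node is either a container (multipart/*, with children) or a leaf (with a body).
    static final class Node {
        final String contentType;
        final String contentId;    // raw Content-ID header, e.g. "<logo@example.test>", or null
        final byte[] body;
        final List<Node> children;

        private Node(String contentType, String contentId, byte[] body, List<Node> children) {
            this.contentType = contentType;
            this.contentId = contentId;
            this.body = body;
            this.children = children;
        }

        static Node leaf(String contentType, String contentId, String body) {
            return new Node(contentType, contentId, body.getBytes(StandardCharsets.UTF_8), List.of());
        }

        static Node multipart(String subtype, Node... children) {
            return new Node("multipart/" + subtype, null, new byte[0], Arrays.asList(children));
        }
    }

    // Recursive walk: containers are descended into, every leaf gets the same image check,
    // regardless of how deep it sits in the message.
    static void collectImages(Node part, Map<String, byte[]> imagesByCid) {
        if (part.contentType.startsWith("multipart/")) {
            for (Node child : part.children) {
                collectImages(child, imagesByCid);
            }
        } else if (part.contentType.startsWith("image/") && part.contentId != null) {
            imagesByCid.put(stripAngleBrackets(part.contentId), part.body);
        }
        // text/plain and text/html leaves would get their own "else if" branches here.
    }

    // The header is written as <id>; the HTML refers to the same image as cid:id.
    static String stripAngleBrackets(String contentId) {
        String id = contentId.trim();
        if (id.startsWith("<") && id.endsWith(">")) {
            return id.substring(1, id.length() - 1);
        }
        return id;
    }

    public static void main(String[] args) {
        // Image directly in the top-level multipart (the case the original code skips).
        Node topLevelImage = Node.multipart("related",
                Node.leaf("text/html", null, "<img src=\"cid:logo@example.test\">"),
                Node.leaf("image/png", "<logo@example.test>", "fake png bytes"));

        // Image one level down, next to a multipart/alternative body.
        Node nestedImage = Node.multipart("mixed",
                Node.multipart("related",
                        Node.multipart("alternative",
                                Node.leaf("text/plain", null, "hello"),
                                Node.leaf("text/html", null, "<img src=\"cid:sig@example.test\">")),
                        Node.leaf("image/gif", "<sig@example.test>", "fake gif bytes")),
                Node.leaf("application/pdf", null, "fake pdf bytes"));

        for (Node message : List.of(topLevelImage, nestedImage)) {
            Map<String, byte[]> images = new LinkedHashMap<>();
            collectImages(message, images);
            System.out.println(images.keySet());
        }
        // prints [logo@example.test] and then [sig@example.test]
    }
}
```

Because the walk makes no assumption about where the image lives, the same method handles both layouts. Parts without a Content-ID are skipped in this sketch; whether to save those as ordinary attachments instead is a separate decision.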