Python: 'Message' object has no attribute 'get_body'


I'm trying to search the body of an email, but I'm running into a problem:

#!/usr/local/bin/python3
from email.message import EmailMessage
import email
import imaplib
import re
import sys
import logging
import base64
import os
logging.basicConfig(stream=sys.stdout, level=logging.INFO)

###########log in to mailbox########################
user = 'email@company.com'
pwd = 'pwd'

conn = imaplib.IMAP4_SSL("outlook.office365.com")
conn.login(user,pwd)
# select() returns (status, [message count]); one call is enough
resp, count = conn.select("test")

resp, items = conn.uid("search", None, '(OR (FROM "some@email") (FROM "some@email"))')

items = items[0].split()
for emailid in items:
    resp, data = conn.uid("fetch",emailid, "(RFC822)")
    if resp == 'OK':
        email_body = data[0][1]#.decode('utf-8')
        mail = email.message_from_bytes(email_body)

        #get all emails with words "PA1" or "PA2" in subject
        # "in" also matches at position 0, which find(...) > 0 would miss
        if "PA1" in mail["Subject"] or "PA2" in mail["Subject"]:
            print(mail)

The problem is with the following line:

body = mail.get_body(preferencelist=('plain', 'html'))

which raises:

AttributeError: 'Message' object has no attribute 'get_body'


Solution

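  • By default, email.message_from_bytes uses the legacy compat32 policy and returns an email.message.Message, which has no get_body method. Pass policy=email.policy.default to get an EmailMessage, which does. Also, you should not convert the MIME structure to a string and then feed that to message_from_string. Instead, keep it as a bytes object and parse it with message_from_bytes.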

    from email.policy import default as default_policy
    ...
    items = items[0].split()
    for emailid in items:
        resp, data = conn.uid("fetch",emailid, "(RFC822)")
        if resp == 'OK':
            email_blob = data[0][1]
            mail = email.message_from_bytes(email_blob, policy=default_policy)
            if not any(x in mail['subject'] for x in ('PA1', 'PA2')):
                continue
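
The effect of the policy argument can be checked without a mail server. The following is a minimal sketch, assuming a small hand-written message (the addresses and the MACHINE: line are made up for illustration) in place of the bytes returned by the IMAP fetch:

```python
import email
from email import policy
from email.message import EmailMessage, Message

# Hand-written stand-in for the bytes in data[0][1]; contents are illustrative.
raw = (
    b"From: sender@example.com\r\n"
    b"Subject: PA1 report\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"MACHINE: host01\r\n"
)

# No policy argument: the legacy compat32 policy is used.
legacy = email.message_from_bytes(raw)
# Explicit modern policy: the parser builds an EmailMessage instead.
modern = email.message_from_bytes(raw, policy=policy.default)

print(type(legacy).__name__, hasattr(legacy, "get_body"))
print(type(modern).__name__, hasattr(modern, "get_body"))
```

Only the object parsed with policy.default should report that it has get_body.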
    

    You are not showing how you traverse the MIME structure, so I assume you are currently not doing that at all. You probably want something like

            # continuation for the above code
            body = mail.get_body(preferencelist=('plain', 'html'))
            # get_body() returns a message part (or None); get_content() gives the decoded text
            for line in body.get_content().split('\n'):
                if line.startswith('MACHINE:'):
                    result = line[8:].strip()
                    break
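
The same extraction can be wrapped in a small function that also copes with messages that have no text body. This is only a sketch: extract_field is a hypothetical helper name, and the sample message is invented for the example.

```python
import email
from email import policy


def extract_field(mail, prefix="MACHINE:"):
    """Return the text after `prefix` on the first matching body line, else None."""
    body = mail.get_body(preferencelist=("plain", "html"))
    if body is None:  # no text/plain or text/html part was found
        return None
    # get_content() returns the decoded text of the chosen part as a str
    for line in body.get_content().splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


# Illustrative message; in the real script this comes from the IMAP fetch.
raw = (
    b"Subject: PA2 status\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hello,\r\n"
    b"MACHINE: host42\r\n"
)
mail = email.message_from_bytes(raw, policy=policy.default)
print(extract_field(mail))
```

Returning None instead of raising keeps the fetch loop simple: messages without a matching line can just be skipped.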
    

    It looks like you have an email body part encoded using Content-Transfer-Encoding: quoted-printable. The above code is robust against various encodings because the email library decodes the encapsulation transparently for you, which gets rid of any QP-escaped line breaks, like the one in your question. For the record, quoted-printable can break up a long line anywhere, including in the middle of the value you are attempting to extract, so you really do want to decode before attempting to extract anything.
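
To illustrate the point about quoted-printable, here is a minimal sketch with an invented message whose MACHINE: value is split by a soft line break (a trailing = at the end of the encoded line). The host name is made up; the point is only that the raw payload contains the break while the decoded content does not.

```python
import email
from email import policy

# Invented quoted-printable message; "=" at end of line is a soft line break.
raw = (
    b"Subject: PA1 report\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"MACHINE: build-server-=\r\n"
    b"0042.example.com\r\n"
)
mail = email.message_from_bytes(raw, policy=policy.default)

# Raw payload: still encoded, so the value is split across two lines.
encoded = mail.get_payload()
# Decoded content: the library removes the soft break and rejoins the line.
decoded = mail.get_body(preferencelist=("plain",)).get_content()

print(repr(encoded))
print(repr(decoded))
```

Searching the encoded text for the full host name would fail, which is why decoding should come first.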