Tags: python, python-3.x, imap

How can I read the body of an email with Python?


Logging in and reading the subject works, but an error occurs when reading the body. What causes the error? In the answers I found online, the error was always in this part: "email.message_from_bytes(data[0][1].decode())", but I think this part of my code is correct.

        # Connection settings
        HOST = 'imap.host'
        USERNAME = 'name@domain.com'
        PASSWORD = 'password'

        m = imaplib.IMAP4_SSL(HOST, 993)
        m.login(USERNAME, PASSWORD)
        m.select('INBOX')

        result, data = m.uid('search', None, "UNSEEN")
        if result == 'OK':
              for num in data[0].split()[:5]:
                    result, data = m.uid('fetch', num, '(RFC822)')
                    if result == 'OK':
                          email_message_raw = email.message_from_bytes(data[0][1])
                          email_from = str(make_header(decode_header(email_message_raw['From'])))
                          # by Edward Chapman -> https://stackoverflow.com/questions/7314942/python-imaplib-to-get-gmail-inbox-subjects-titles-and-sender-name
                          subject = str(email.header.make_header(email.header.decode_header(email_message_raw['Subject'])))
                          # content = email_message_raw.get_payload(decode=True)
                          # by Todor Minakov -> https://stackoverflow.com/questions/17874360/python-how-to-parse-the-body-from-a-raw-email-given-that-raw-email-does-not
                          b = email.message_from_string(email_message_raw)
                          body = ""

                          if b.is_multipart():
                              for part in b.walk():
                                  ctype = part.get_content_type()
                                  cdispo = str(part.get('Content-Disposition'))

                                  # skip any text/plain (txt) attachments
                                  if ctype == 'text/plain' and 'attachment' not in cdispo:
                                      body = part.get_payload(decode=True)  # decode
                                      break
                          # not multipart - i.e. plain text, no attachments, keeping fingers crossed
                          else:
                              body = b.get_payload(decode=True)
                          
        m.close()
        m.logout()


        txt = body
        regarding = subject
        print("###########################################################")
        print(regarding)
        print("###########################################################")
        print(txt)
        print("###########################################################")

Error message:

TypeError: initial_value must be str or None, not Message

Thanks for the comments and reply


Solution

  • You have almost everything in place. You just need to understand a few concepts.

    The "email" library converts raw email bytes into an easy-to-use object called Message through its parser APIs, such as message_from_bytes() and message_from_string().

    The error usually reported for this kind of code comes from passing the wrong input type to the parser:

    email.message_from_bytes(data[0][1].decode())

    The function above, message_from_bytes, takes bytes as input, not str. data[0][1] is already bytes, so it should be passed to the parser as-is. Calling .decode() first hands the parser a str, which it cannot handle.
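
    As a minimal sketch of that input-type rule, the snippet below parses the same tiny message once from bytes and once from str, then shows that handing a str to message_from_bytes() raises an exception. The raw message and the variable names are made up for illustration and are not taken from the question.

```python
import email

# A tiny hand-written message standing in for data[0][1] (illustrative only).
raw_bytes = b"Subject: Hello\r\n\r\nBody text\r\n"

# bytes go to message_from_bytes() ...
msg_from_bytes = email.message_from_bytes(raw_bytes)
# ... and str goes to message_from_string().
msg_from_str = email.message_from_string(raw_bytes.decode("ascii"))
print(msg_from_bytes["Subject"], msg_from_str["Subject"])

# Mixing the two up raises an exception instead of returning a Message.
mismatch_error = None
try:
    email.message_from_bytes(raw_bytes.decode("ascii"))
except (AttributeError, TypeError) as exc:
    mismatch_error = exc
    print(type(exc).__name__)
```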

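    In short, you are parsing the original email message twice: once with message_from_bytes(data[0][1]) and again with message_from_string(email_message_raw). The second call receives a Message object instead of a str, and that is what raises "TypeError: initial_value must be str or None, not Message". Remove the second call, use the Message you already have, and you will be all set!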
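
    The double parse can be reproduced without an IMAP server. This is a small sketch under the assumption that a hand-written message behaves like the one fetched in the question. The variable names are illustrative.

```python
import email

# Stand-in for the bytes that imaplib returns in data[0][1] (illustrative only).
raw_bytes = b"Subject: Hello\r\n\r\nBody text\r\n"
message = email.message_from_bytes(raw_bytes)  # first parse: gives a Message

error_text = None
try:
    # Second parse: a Message is passed where a str is expected.
    email.message_from_string(message)
except TypeError as exc:
    error_text = str(exc)
    print(error_text)

# The Message from the first parse can be used directly instead.
print(message.is_multipart())
```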

    Try this approach:

        import email
        import imaplib
        from email.header import decode_header, make_header

        HOST = 'imap.host'
        USERNAME = 'name@domain.com'
        PASSWORD = 'password'
    
        m = imaplib.IMAP4_SSL(HOST, 993)
        m.login(USERNAME, PASSWORD)
        m.select('INBOX')
    
        result, data = m.uid('search', None, "UNSEEN")
        if result == 'OK':
              for num in data[0].split()[:5]:
                    result, data = m.uid('fetch', num, '(RFC822)')
                    if result == 'OK':
                          # parse the raw bytes exactly once
                          email_message = email.message_from_bytes(data[0][1])
                          email_from = str(make_header(decode_header(email_message['From'])))
                          # by Edward Chapman -> https://stackoverflow.com/questions/7314942/python-imaplib-to-get-gmail-inbox-subjects-titles-and-sender-name
                          subject = str(email.header.make_header(email.header.decode_header(email_message['Subject'])))
                          # content = email_message.get_payload(decode=True)
                          # by Todor Minakov -> https://stackoverflow.com/questions/17874360/python-how-to-parse-the-body-from-a-raw-email-given-that-raw-email-does-not
                          # b = email.message_from_string(email_message_raw)  # removed: second parse caused the TypeError
                          # email_message is already a Message object, which has methods such as is_multipart() and walk()
                          b = email_message 
                          body = ""
                          body = ""
    
                          if b.is_multipart():
                              for part in b.walk():
                                  ctype = part.get_content_type()
                                  cdispo = str(part.get('Content-Disposition'))
    
                                  # skip any text/plain (txt) attachments
                                  if ctype == 'text/plain' and 'attachment' not in cdispo:
                                      body = part.get_payload(decode=True)  # decode
                                      break
                          # not multipart - i.e. plain text, no attachments, keeping fingers crossed
                          else:
                              body = b.get_payload(decode=True)
                          
        m.close()
        m.logout()
    
    
        txt = body
        regarding = subject
        print("###########################################################")
        print(regarding)
        print("###########################################################")
        print(txt)
        print("###########################################################")
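
The body-extraction loop above can also be factored into a small helper that returns a str instead of bytes. This is only a sketch: extract_text_body is a hypothetical name, the sample message is built locally rather than fetched over IMAP, and falling back to UTF-8 when no charset is declared is an assumption. Unlike the loop above, it returns an empty string when the message has no text/plain part at all.

```python
import email
from email.message import EmailMessage


def extract_text_body(msg):
    """Return the first non-attachment text/plain part as str ('' if none)."""
    # walk() yields the message itself first, so single-part mail is covered too.
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        # skip text/plain attachments, as in the loop above
        if "attachment" in str(part.get("Content-Disposition")):
            continue
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or "utf-8"  # assumed fallback
        return payload.decode(charset, errors="replace")
    return ""


# Build a small multipart message locally (illustrative data only).
sample = EmailMessage()
sample["Subject"] = "Test"
sample["From"] = "name@domain.com"
sample.set_content("Hello from the plain-text part.")
sample.add_alternative("<p>Hello from the HTML part.</p>", subtype="html")

# Round-trip through bytes, which is the form imaplib hands over.
parsed = email.message_from_bytes(sample.as_bytes())
body = extract_text_body(parsed)
print(body)
```

Inside the fetch loop, the helper would be called as extract_text_body(email_message) in place of the is_multipart() branch.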