
Getting rid of certain text from the body of an email using Python


I'm trying to parse the body of a forwarded email using the following Python code:

import imapclient
import os
import pprint
import pyzmail
import email

#my email info
EMAIL_ADRESS = os.environ.get('DB_USER')
EMAIL_PASSWORD = os.environ.get('PYTHON_PASS')

#login to my email
imap0bj =  imapclient.IMAPClient('imap.gmail.com', ssl = True)
imap0bj.login(EMAIL_ADRESS, EMAIL_PASSWORD )
print("ok")


pprint.pprint(imap0bj.list_folders())
#Selecting my Inbox
imap0bj.select_folder('INBOX', readonly = True)

#Getting UIDs from Inbox
UIDs = imap0bj.search(['SUBJECT', 'Contact FB Applicant', 'ON', '16-Oct-2020'])
print(UIDs)


rawMessages = imap0bj.fetch(UIDs, ['BODY[]'])
message = pyzmail.PyzMessage.factory(rawMessages[9999][b'BODY[]'])

#Body of the email returned as a string (only if there is a plain-text part)
if message.text_part is not None:
    msg = message.text_part.get_payload().decode(message.text_part.charset)
    print(msg)

imap0bj.logout()

This code outputs a string similar to this:

   ---------- Forwarded message ---------
    From: Someone <[email protected]>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd: 🌟 Contact FB Applicant🌟
    To: <[email protected]>
    
    
    
    
   ---------- Forwarded message ---------
    From: Someone <[email protected]>
    Date: Wed, Oct 14, 2020 at 1:23 PM
    Subject: Fwd: 🌟 Contact FB Applicant🌟
    To: <[email protected]>
    
    
    The following applicant filled out the form via Facebook.  Contact
    immediately.
    
    Some Guy
    999999999999
    [email protected]

But I don't want the "Forwarded message" parts. I only want the text from "The following applicant..." onward, which is the info I care about. How do I get rid of the other stuff? I'd really appreciate the help. Thank you!


Solution

  • You can use io.StringIO.

    Here's how you would use it:

    from io import StringIO
    
    # your code goes here
    ...
    ...
    
    msg = message.text_part.get_payload().decode(message.text_part.charset)
    
    sio = StringIO(msg)
    
    sio.seek(msg.index('The following applicant'))
    
    for line in sio:
        # each line already ends with its own newline, so don't add another
        print(line, end='')
    

    How it works:

    StringIO lets you treat a string as a file-like text stream. StringIO.seek moves the stream's current position to a given offset (0 is the beginning of the stream). str.index returns the index of the first occurrence of a substring and raises ValueError if the substring is not found. Putting it all together: you move the stream position to the first occurrence of the text you want, and then read from the stream starting at that point.
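
The same idea can be tried without an IMAP connection. The following is a minimal sketch, not part of the original answer: SAMPLE, MARKER, and both function names are hypothetical, and the sample text only stands in for the decoded email body. It uses str.find instead of str.index so that a message without the marker is returned unchanged rather than raising ValueError, and it shows that for a string already in memory, plain slicing gives the same result as seeking a StringIO.

```python
from io import StringIO

# Hypothetical sample standing in for the decoded email body (msg).
SAMPLE = (
    "---------- Forwarded message ---------\n"
    "From: Someone <someone@example.com>\n"
    "Subject: Fwd: Contact FB Applicant\n"
    "\n"
    "The following applicant filled out the form via Facebook.  Contact\n"
    "immediately.\n"
    "\n"
    "Some Guy\n"
)

# Assumed marker: the first words of the part worth keeping.
MARKER = "The following applicant"


def body_from_marker(text, marker=MARKER):
    """Return text starting at the first occurrence of marker.

    str.find returns -1 when the marker is absent (str.index would
    raise ValueError), in which case the full text is returned.
    """
    start = text.find(marker)
    if start == -1:
        return text
    return text[start:]


def body_from_marker_stream(text, marker=MARKER):
    """Same result via StringIO: seek to the marker, then read the rest."""
    start = text.find(marker)
    stream = StringIO(text)
    # Seek to the marker, or stay at offset 0 when it was not found.
    stream.seek(max(start, 0))
    return stream.read()
```

Calling body_from_marker(msg) on the decoded body drops everything before the marker, including both "Forwarded message" header blocks. The slicing version is probably the simpler choice when the whole body is already a string; the StringIO version may be handier if the rest of the code expects to read line by line.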