Python search imap email for a string


I'm new to Python and having some trouble getting past this.
I am fetching emails from Gmail via IMAP (with starter code from https://yuji.wordpress.com/2011/06/22/python-imaplib-imap-example-with-gmail/) and want to search a specific email (which I am able to fetch) for a specific string. Something like this:

ids = data[0]
id_list = ids.split()
latest_email_id = id_list[-1]
result, data = mail.fetch(latest_email_id, "(RFC822)") 
raw_email = data[0][1] 

def search_raw():
    if 'gave' in raw_email:
        done = 'yes'
    else:
        done = 'no'

and it always sets done to 'no'. Here's the output for the body section of the email:

Content-Type multipart/related;boundary=1_56D8EAE1_29AD7EA0;type="text/html"
--1_56D8EAE1_29AD7EA0
Content-Type text/html;charset="UTF-8"
Content-Transfer-Encoding base64

PEhUTUw+CiAgICAgICAgPEhFQUQ+CiAgICAgICAgICAgICAgICA8VElUTEU+PC9USVRMRT4KICAg
ICAgICA8L0hFQUQ+CiAgICAgICAgPEJPRFk+CiAgICAgICAgICAgICAgICA8UCBhbGlnbj0ibGVm
dCI+PEZPTlQgZmFjZT0iVmVyZGFuYSIgY29sb3I9IiNjYzAwMDAiIHNpemU9IjIiPlNlbnQgZnJv
bSBteSBtb2JpbGUuCiAgICAgICAgICAgICAgICA8QlI+X19fX19fX19fX19fX19fX19fX19fX19f
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXzwvRk9OVD48L1A+CgogICAgICAg
ICAgICAgICAgPFBSRT4KR2F2ZQoKPC9QUkU+CiAgICAgICAgPC9CT0RZPgo8L0hUTUw+Cg==
--1_56D8EAE1_29AD7EA0--

I know the issue is the HTML, but I can't figure out how to parse the email properly.

Thank you!


Solution

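  • The body shown above is base64-encoded, so the plain string 'gave' never appears in the raw message. Python's standard base64 module can decode it. Note that the decoded text contains "Gave" with a capital G, so the search also needs to be case-insensitive.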

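    import base64
    import re


    def has_gave(encoded_body):
        # encoded_body must be only the base64 payload of the text/html part,
        # not the whole raw message with its headers and boundary lines.
        email_body = base64.b64decode(encoded_body).decode('utf-8')
        # re.IGNORECASE lets 'gave' match 'Gave' in the decoded HTML.
        match = re.search(r'.*gave.*', email_body, re.IGNORECASE)
        if match:
            done = 'yes'
            print('match found for word', match.group())
        else:
            done = 'no'
            print('no match found')

        return done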
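
  • If you would rather not cut the base64 payload out of the message by hand, the standard library's email package can parse the raw message and decode each part for you. The following is only a sketch, assuming Python 3 and that raw_email is what mail.fetch returned in data[0][1]. The names message_contains and SAMPLE are made up for illustration, and SAMPLE is a hypothetical single-part message built from the last payload line in the question.

```python
import email
import re


def message_contains(raw_email, word):
    """Return True if any text part of the message contains word (case-insensitive)."""
    # imaplib returns bytes on Python 3; accept str as well for convenience.
    if isinstance(raw_email, bytes):
        msg = email.message_from_bytes(raw_email)
    else:
        msg = email.message_from_string(raw_email)

    # walk() visits the message itself and every nested part.
    for part in msg.walk():
        if part.get_content_maintype() != 'text':
            continue  # skip multipart containers, images, attachments
        # decode=True undoes the Content-Transfer-Encoding (base64 here).
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or 'utf-8'
        text = payload.decode(charset, errors='replace')
        if re.search(re.escape(word), text, re.IGNORECASE):
            return True
    return False


# Hypothetical sample: headers written with colons, body taken from the
# final base64 line of the message shown in the question.
SAMPLE = (
    b'Content-Type: text/html; charset="UTF-8"\r\n'
    b'Content-Transfer-Encoding: base64\r\n'
    b'\r\n'
    b'ICAgICAgICAgPFBSRT4KR2F2ZQoKPC9QUkU+CiAgICAgICAgPC9CT0RZPgo8L0hUTUw+Cg==\r\n'
)

print(message_contains(SAMPLE, 'gave'))
```

    With the code from the question, this would be called as message_contains(raw_email, 'gave'). Because it checks every text part, it should also cope with messages that carry both a plain-text and an HTML body.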