I'm new to Python and having some trouble getting past this.
I'm fetching emails from Gmail via IMAP (with starter code from https://yuji.wordpress.com/2011/06/22/python-imaplib-imap-example-with-gmail/) and want to search a specific email, which I am able to fetch, for a specific string. Something like this:
ids = data[0]
id_list = ids.split()
latest_email_id = id_list[-1]
result, data = mail.fetch(latest_email_id, "(RFC822)")
raw_email = data[0][1]
def search_raw():
    if 'gave' in raw_email:
        done = 'yes'
    else:
        done = 'no'
It always sets done to 'no'. Here is the output for the email (the body section):
Content-Type multipart/related;boundary=1_56D8EAE1_29AD7EA0;type="text/html"
--1_56D8EAE1_29AD7EA0
Content-Type text/html;charset="UTF-8"
Content-Transfer-Encoding base64
PEhUTUw+CiAgICAgICAgPEhFQUQ+CiAgICAgICAgICAgICAgICA8VElUTEU+PC9USVRMRT4KICAg
ICAgICA8L0hFQUQ+CiAgICAgICAgPEJPRFk+CiAgICAgICAgICAgICAgICA8UCBhbGlnbj0ibGVm
dCI+PEZPTlQgZmFjZT0iVmVyZGFuYSIgY29sb3I9IiNjYzAwMDAiIHNpemU9IjIiPlNlbnQgZnJv
bSBteSBtb2JpbGUuCiAgICAgICAgICAgICAgICA8QlI+X19fX19fX19fX19fX19fX19fX19fX19f
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fXzwvRk9OVD48L1A+CgogICAgICAg
ICAgICAgICAgPFBSRT4KR2F2ZQoKPC9QUkU+CiAgICAgICAgPC9CT0RZPgo8L0hUTUw+Cg==
--1_56D8EAE1_29AD7EA0--
I know the issue is the HTML, but I can't seem to figure out how to parse the email properly.
Thank you!
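The body shown above is base64-encoded, so the text you are looking for does not appear literally in the raw body. Python's base64 module can decode it. Note that the decoded HTML contains "Gave" with a capital G, so the search also needs to be case-insensitive.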
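import base64
import re

def has_gave(encoded_body):
    # encoded_body must be only the base64 payload of the text part,
    # not the whole raw message (headers and boundaries are not base64).
    email_body = base64.b64decode(encoded_body).decode('utf-8')
    # re.IGNORECASE lets 'gave' match 'Gave' in the decoded HTML.
    match = re.search(r'.*gave.*', email_body, re.IGNORECASE)
    if match:
        done = 'yes'
        print('match found for word ' + match.group())
    else:
        done = 'no'
        print('no match found')
    return done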
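Since raw_email from mail.fetch is the whole RFC822 message (headers, MIME boundaries, and the encoded part), it may be easier to let the standard email module find and decode the text parts instead of calling base64 by hand. The following is a minimal sketch, assuming Python 3; the function name message_contains and its word parameter are made up for illustration and are not part of the question's code.

```python
import email
import re


def message_contains(raw_email, word):
    """Return True if any text/* part of the message contains `word`,
    ignoring case. `message_contains` is a hypothetical helper name."""
    # imaplib hands back bytes on Python 3; accept str as well.
    if isinstance(raw_email, bytes):
        msg = email.message_from_bytes(raw_email)
    else:
        msg = email.message_from_string(raw_email)

    # walk() yields the message itself and every nested MIME part.
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != 'text':
            continue
        # decode=True undoes the Content-Transfer-Encoding (base64 here).
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Assumption: fall back to UTF-8 when the part declares no charset.
        charset = part.get_content_charset() or 'utf-8'
        text = payload.decode(charset, errors='replace')
        if re.search(re.escape(word), text, re.IGNORECASE):
            return True
    return False
```

With the variables from the question, the check would then be something like done = 'yes' if message_contains(raw_email, 'gave') else 'no'. This searches the decoded HTML source, so it can also match text inside tags or attributes.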