Flask IMAP application retrieving unnecessary and incorrect characters


The application uses the get_payload() method to retrieve the HTML of each message. The problem is that the retrieved HTML is littered with literal \r, \t and \n sequences, so the HTML rendered by my application does not match what Gmail shows.

I carefully compared the HTML from Gmail with the HTML from my application. Where Gmail has an empty <td height="32"></td> tag, my application shows a string of those useless characters. In the email as Gmail displays it, that spot is just blank space. Any idea why I am getting this?

Note: This happens with other emails too, even with an email that contains only plain text.

The following is the Python code I am using:

import email
import email.header
import datetime
import imaplib
import sys
from pprint import pprint

imap_host = 'imap.gmail.com'
imap_user = '[email protected]'
imap_pass = 'somePassword'

diction = []


def process_mailbox(m):

    rv, data = m.search(None, "ALL")
    if rv != 'OK':
        print('No messages found!')
        return

    for num in data[0].split():
        rv, data = m.fetch(num, '(RFC822)')
        if rv != 'OK':
            print("ERROR getting message", num)
            return

        msg = email.message_from_bytes(data[0][1])
        hdr = email.header.make_header(email.header.decode_header(msg['Subject']))
        subject = str(hdr)
        print('Message %s: %s' % (num, subject))

        # date_tuple = email.utils.parsedate_tz(msg['Date'])
        # if date_tuple:
        #   local_date = datetime.datetime.fromtimestamp(email.utils.mktime_tz(date_tuple))
        #   print('Local Date:', local_date.strftime('%a, %d %b %Y %H:%M:%S'))
        for part in msg.walk():
            if part.get_content_type() == 'text/html':
                # print(part.get_payload(decode=True))
                diction.append({'body': part.get_payload(decode=True)})
    return diction


M = imaplib.IMAP4_SSL('imap.gmail.com')

try:
    rv, data = M.login(imap_user, imap_pass)
except imaplib.IMAP4.error:
    print("LOGIN FAILED!")
    sys.exit(1)

# print(rv, data)

rv, mailboxes = M.list()
if rv == 'OK':
    print('Mailboxes:')
    print(mailboxes)

rv, data = M.select('Inbox')
if rv == 'OK':
    print('Processing mailbox...\n')
    process_mailbox(M)
    M.close()
else:
    print('ERROR: Unable to open mailbox', rv)

# Log out in both cases so the connection is not left open
M.logout()

Here is the Flask code:

from flask import Flask, render_template, url_for
from forms import RegistrationForm, LoginForm

import email_client


a = email_client.diction

app = Flask(__name__)


@app.route('/test')
def test():
    return render_template('test.html', text=a)


@app.route('/')
@app.route('/email')
def home():
    return render_template('home.html')


@app.route('/about')
def about():
    return render_template('about.html', title='About')


@app.route('/register')
def register():
    form = RegistrationForm()
    return render_template('register.html', title='Register', form=form)


if __name__ == '__main__':
    app.run(debug=True)

And the template (test.html):

{% for t in text %}
<div class="card content-section">
    <div class="card-body">
        {{ t.body |safe}}
    </div>
</div>
{% endfor %}

Edit:

I added the Markup import and changed the for loop that reads the body of the message to:

        for part in msg.walk():
            if part.get_content_type() == 'text/html':
                value = Markup(part.get_payload(decode=True))
                print(value)
                diction.append({'body': value})

Solution

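  • I found the solution. get_payload(decode=True) returns bytes, not str, so the template was rendering the bytes representation (b'...') with its escaped \r, \t and \n sequences. Decoding the payload to a string solves the problem:

    part.get_payload(decode=True).decode('utf-8')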
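
To illustrate the fix, here is a minimal sketch using only the standard library. The helper name html_bodies and the sample message are hypothetical and not part of the original code. It also assumes it is reasonable to use the charset declared on each part and fall back to UTF-8 when none is declared; a message declaring a charset Python does not know would still raise an error.

```python
import email
from email.message import EmailMessage


def html_bodies(raw_bytes):
    """Return every text/html part of a raw message as a decoded str."""
    msg = email.message_from_bytes(raw_bytes)
    bodies = []
    for part in msg.walk():
        if part.get_content_type() != 'text/html':
            continue
        payload = part.get_payload(decode=True)  # bytes (transfer encoding undone)
        if payload is None:
            continue
        # Assumption: trust the declared charset, fall back to UTF-8
        charset = part.get_content_charset() or 'utf-8'
        bodies.append(payload.decode(charset, errors='replace'))
    return bodies


# Why the stray characters appear: converting bytes to text without
# decoding yields the bytes repr, with escapes spelled out literally.
raw_part = b'<td height="32">\r\n\t</td>'
as_text = str(raw_part)             # starts with b' and contains a literal backslash-r
decoded = raw_part.decode('utf-8')  # real carriage return, newline and tab

# Hypothetical sample message standing in for what M.fetch() returns
sample = EmailMessage()
sample['Subject'] = 'Demo'
sample.set_content('plain text fallback')
sample.add_alternative('<table>\n\t<tr><td height="32"></td></tr>\n</table>',
                       subtype='html')
bodies = html_bodies(sample.as_bytes())
```

In process_mailbox, the same idea would mean appending the decoded string to diction instead of the raw bytes, so the |safe filter in the template receives real HTML text.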