Tags: python, utf-8, character-encoding, smtp, cjk

Encoding mail subject (SMTP) in Python with non-ASCII characters


I am using the Python module MimeWriter to construct a message and smtplib to send it. The constructed message is:

file msg.txt:
-----------------------
Content-Type: multipart/mixed;
from: me<me@abc.com>
to: me@abc.com
subject: 主題

Content-Type: text/plain;charset=utf-8

主題

I use the code below to send a mail:

import smtplib

s = smtplib.SMTP('smtp.abc.com')
to_list = ['me@abc.com']
# Read the message shown above from msg.txt
with open('msg.txt') as f:
    msg = f.read()
s.sendmail('me@abc.com', to_list, msg)

The mail body arrives correctly, but the subject is garbled:

subject: some junk characters

主題           <- body is correct.

Is there any way to specify the encoding to be used for the subject, as is done for the body? How can I get the subject to display correctly?


Solution

  • From http://docs.python.org/library/email.header.html

    from email.message import Message
    from email.header import Header
    msg = Message()
    msg['Subject'] = Header('主題', 'utf-8')
    print(msg.as_string())
    

    Subject: =?utf-8?b?5Li76aGM?=

    More simply:

    from email.header import Header
    print(Header('主題', 'utf-8').encode())
    

    =?utf-8?b?5Li76aGM?=

    To complement this, an encoded header can be decoded with:

    from email.header import decode_header
    a = decode_header("""=?utf-8?b?5Li76aGM?=""")[0]
    print(a[0].decode(a[1]))
    

    Reference: Python - email header decoding UTF-8
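
Applied to the message from the question, the same idea could look roughly like the sketch below. It assumes Python 3 and builds the message with the standard `email.mime` classes instead of a hand-written msg.txt; the helper name `build_message` is made up for illustration, and the addresses and server name are simply the placeholders from the question.

```python
from email.header import Header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(sender, recipient, subject, body):
    """Hypothetical helper: multipart message with an RFC 2047 encoded Subject."""
    msg = MIMEMultipart("mixed")
    msg["From"] = sender
    msg["To"] = recipient
    # Header(...) turns the non-ASCII subject into an ASCII encoded-word.
    msg["Subject"] = Header(subject, "utf-8")
    # The body part declares its own charset in its Content-Type header.
    msg.attach(MIMEText(body, "plain", "utf-8"))
    return msg


msg = build_message("me@abc.com", "me@abc.com", "主題", "主題")
raw = msg.as_string()
print(raw)

# Sending is left commented out because it needs a reachable SMTP server:
# import smtplib
# with smtplib.SMTP("smtp.abc.com") as s:
#     s.sendmail("me@abc.com", ["me@abc.com"], raw)
```

The serialized message contains only ASCII: the subject appears as `=?utf-8?b?5Li76aGM?=` and the body is transfer-encoded, so it can be handed to `sendmail` as a plain string.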