I wish to use Python's email module to change the encoding of MIME mail message parts from quoted-printable or base64 to 7bit or 8bit. Everything seems to work, except that at the end, for some messages, email.message.Message.as_string encodes some parts (both text/plain and text/html encountered) as base64. I do not understand why, and I would like to understand this behavior so that I can avoid it.
The script code:
import sys
import email
import email.encoders

# Read and parse the message from stdin
msg = email.message_from_string(sys.stdin.read())
for part in msg.walk():
    if part.get_content_maintype() == 'text':
        if part['Content-Transfer-Encoding'] in {'quoted-printable', 'base64'}:
            # Decode the body to raw bytes, then store it back unencoded
            payload = part.get_payload(decode=True)
            del part['Content-Transfer-Encoding']
            part.set_payload(payload)
            # Sets Content-Transfer-Encoding to 7bit or 8bit depending on the payload
            email.encoders.encode_7or8bit(part)
# Send the modified message to stdout
print(msg.as_string())
(If this matters: I use Python 3.3)
Use as_bytes instead (it was added in Python 3.4). So change your print to:
print(msg.as_bytes().decode(encoding='UTF-8'))
The reason is given in the policy documentation: https://docs.python.org/3.4/library/email.policy.html#module-email.policy
A cte_type value of 8bit only works with BytesGenerator, not Generator, because strings cannot contain binary data. If a Generator is operating under a policy that specifies cte_type=8bit, it will act as if cte_type is 7bit.
as_string uses Generator, whereas as_bytes uses BytesGenerator, which is what you need here.
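To see the difference, here is a minimal sketch that runs the conversion from the question on a tiny hand-made message and serializes it both ways. The sample message (RAW) and the helper name reencode_text_parts are made up for illustration, and it assumes Python 3.4+ (for as_bytes) with the default compat32 policy.

```python
import email
import email.encoders

# Hypothetical sample: one UTF-8 text part, quoted-printable encoded ("café").
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "caf=C3=A9\n"
)


def reencode_text_parts(msg):
    """Re-encode quoted-printable/base64 text parts as 7bit or 8bit, in place."""
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        cte = str(part.get("Content-Transfer-Encoding", "")).lower()
        if cte in {"quoted-printable", "base64"}:
            payload = part.get_payload(decode=True)  # raw bytes of the body
            del part["Content-Transfer-Encoding"]
            part.set_payload(payload)
            email.encoders.encode_7or8bit(part)  # picks 7bit or 8bit
    return msg


msg = reencode_text_parts(email.message_from_string(RAW))

# Generator (str output) cannot carry 8bit data, so it re-encodes the part.
text_out = msg.as_string()
# BytesGenerator (bytes output) writes the 8bit body through unchanged.
bytes_out = msg.as_bytes()

print(text_out)
print(bytes_out)
```

Note that print(msg.as_bytes().decode(encoding='UTF-8')) only works when every 8bit part really is UTF-8. If the messages may use other charsets, writing the bytes directly with sys.stdout.buffer.write(msg.as_bytes()) avoids that assumption.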