Converting Latin characters to quoted-printable in Python


How to convert Latin characters to quoted-printable encoding in Python?

I know about quopri but it doesn't work with Latin characters (maybe I'm doing something wrong).

Here is my code:

import quopri

fly_as_quoted_printable = b'=28=46=6C=79=29'
fly_as_bytes = quopri.decodestring(fly_as_quoted_printable)
fly_as_utf8 = fly_as_bytes.decode('utf-8')

print('\nConverting `quoted_printable` to bytes and string is ok:')
print(f'fly_as_quoted_printable= {fly_as_quoted_printable}')
print(f'fly_as_bytes= {fly_as_bytes}')
print(f'fly_as_utf8= {fly_as_utf8}')

cyrillic_and_latin_mixed_as_bytes = bytes('Полёт (Fly)', 'utf-8')
quoted_printable = quopri.encodestring(cyrillic_and_latin_mixed_as_bytes)

print('\nBut converting Latin characters as bytes to `quoted_printable` does not work:')
print(f'cyrillic_and_latin_mixed_as_bytes= {cyrillic_and_latin_mixed_as_bytes}')
print(f'quoted_printable= {quoted_printable}')

The output is:

Converting `quoted_printable` to bytes and string is ok:
fly_as_quoted_printable= b'=28=46=6C=79=29'
fly_as_bytes= b'(Fly)'
fly_as_utf8= (Fly)

But converting Latin characters as bytes to `quoted_printable` does not work:
cyrillic_and_latin_mixed_as_bytes= b'\xd0\x9f\xd0\xbe\xd0\xbb\xd1\x91\xd1\x82 (Fly)'
quoted_printable= b'=D0=9F=D0=BE=D0=BB=D1=91=D1=82 (Fly)'

Solution

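  • The stdlib quopri module is working as intended: it leaves bytes that are already printable ASCII unquoted, because quoted-printable only requires escaping bytes that cannot be sent as-is (source). The output b'=D0=9F=D0=BE=D0=BB=D1=91=D1=82 (Fly)' is therefore valid quoted-printable, and it decodes back to the original bytes.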

    If you also want the already printable characters encoded, you can escape them manually, but it is probably not necessary, since a decoder treats both forms the same. For "(Fly)" that would look like:

    >>> ''.join([f"={ord(c):02X}" for c in "(Fly)"])
    '=28=46=6C=79=29'
    >>> quopri.decodestring(''.join([f"={ord(c):02X}" for c in "(Fly)"]))
    b'(Fly)'
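
The one-liner above relies on ord(), so it only suits ASCII text. For a mixed string such as the one in the question, a minimal sketch could escape the encoded bytes instead of the characters. The helper name encode_all_bytes is hypothetical (it is not part of quopri), and it assumes the result is short enough that the 76-character line limit of quoted-printable does not matter, since it inserts no soft line breaks:

```python
import quopri


def encode_all_bytes(text: str, encoding: str = "utf-8") -> bytes:
    """Escape every byte as =XX, printable ASCII included (hypothetical helper)."""
    raw = text.encode(encoding)
    # :02X pads to two hex digits, which the =XX escape form requires.
    return "".join(f"={b:02X}" for b in raw).encode("ascii")


mixed = "Полёт (Fly)"
fully_quoted = encode_all_bytes(mixed)
print(fully_quoted)
# b'=D0=9F=D0=BE=D0=BB=D1=91=D1=82=20=28=46=6C=79=29'

# Round trip: the stdlib decoder accepts the fully escaped form.
print(quopri.decodestring(fully_quoted).decode("utf-8"))
# Полёт (Fly)
```

If the only goal is to escape the embedded spaces and tabs as well, quopri.encodestring also takes a quotetabs=True argument, which may be enough without any manual escaping.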