node.js, character-encoding, iconv, mime-message, quoted-printable

How to fix broken Korean text


When we send a Korean email through Exchange Server, it arrives with the MIME Content-Transfer-Encoding quoted-printable, the UTF-8 charset, and the HTML tag: <meta content="text/html; charset=euc-kr" http-equiv="Content-Type"/>.
We parse the emails with nodemailer.

The final Korean text looks like: 하나은행 보안메일
EML QP: =ED=95=98=EB=82=98=EC=9D=80=ED=96=89 =EB=B3=B4=EC=95=88=EB=A9=94=EC=9D=BC

On the other hand, when we send the same email via the SMTP Connector, it arrives broken.
The garbled Korean text looks like: 占싹놂옙占쏙옙占쏙옙 占쏙옙占싫몌옙占쏙옙
EML QP: =E5=8D=A0=EC=8B=B9=EB=86=82=EC=98=99=E5=8D=A0=EC=8F=99=EC=98=99=E5=8D=A0=EC=8F=99=EC=98=99 =E5=8D=A0=EC=8F=99=EC=98=99=E5=8D=A0=EC=8B=AB=EB=AA=8C=EC=98=99=E5=8D=A0=EC=8F=99=EC=98=99
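
To check what the two EML QP payloads above actually contain, they can be decoded by hand. The following is a minimal sketch using only Node.js built-ins; `decodeQuotedPrintable` is a hypothetical helper written for this illustration (it is not part of nodemailer or libqp) and it ignores soft line breaks. Only the first few bytes of each payload are decoded.

```javascript
// Minimal quoted-printable decoder: handles only =XX escapes and literal
// ASCII; soft line breaks ("=" at end of line) are not handled.
function decodeQuotedPrintable(qp) {
  const bytes = [];
  for (let i = 0; i < qp.length; i++) {
    const hex = qp.slice(i + 1, i + 3);
    if (qp[i] === '=' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(qp.charCodeAt(i));
    }
  }
  return Buffer.from(bytes);
}

// First two syllables of the good message, first character of the broken one.
const good = decodeQuotedPrintable('=ED=95=98=EB=82=98').toString('utf8');
const broken = decodeQuotedPrintable('=E5=8D=A0').toString('utf8');

console.log(good);   // 하나
console.log(broken); // 占
```

Both payloads are well-formed UTF-8. The broken one spells out the garbled characters themselves, which suggests the text was already damaged before the message was quoted-printable encoded.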

I assume the problem is an incorrect Exchange Server configuration; maybe it decodes UTF-16 as UTF-8. Unfortunately, we don't have access to the remote Exchange Server, so the only option is to fix the broken text locally, after it arrives.

This is an example that didn't work:

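const libqp = require('libqp');
const iconv = require('iconv-lite');
// res holds the quoted-printable body of the received message
let html = libqp.decode(res);             // Buffer of raw bytes
let html2 = iconv.decode(html, 'euc-kr'); // still garbled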

UPDATE: Thanks to https://stackoverflow.com/users/3439404/josefz, the issue can be reproduced:

iconv.decode(iconv.encode(iconv.decode(iconv.encode('하나은행 보안메일', 'euc_kr'), 'utf_8'), 'utf_8'), 'euc-kr')

Now we have to run it in the opposite direction...


Solution

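  • You can't. The result is already broken, and there is no way to undo it: when the EUC-KR bytes were decoded as UTF-8, each invalid byte was replaced with U+FFFD, so the original byte values are gone.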