I have code that needs to support Japanese. Internally my data is all UTF-8, but one rarely-used routine exports a text file for import into PowerPoint. For old versions of PowerPoint, the required encoding was Shift-JIS, and mb_convert_encoding($output, "SJIS") worked just fine for many years. But now I've discovered that from Office 2016 onward, the encoding needs to be UTF-16 LE (Microsoft just has to be different...sigh!). Fine, I thought, I'll just change the expression to mb_convert_encoding($output, "UTF-16LE"). But whatever PHP is doing, the resulting file is not recognized as being Unicode at all (and of course looks horrid). Notepad++ thinks it's "GB2312 (Simplified)" and even thinks the line endings are CR only, even though they are definitely CRLF. Anyone have a guess as to why it doesn't work?
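You are most probably missing the byte order mark (BOM), which is used to indicate, well, the byte order in UTF-16 strings. mb_convert_encoding with "UTF-16LE" does not add one, so without it an editor has to guess the encoding from the raw bytes.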
I struggled to find software that would consume UTF-16, but in the end I just saved the contents to a .txt file and opened it using macOS TextEdit/QuickLook.
<?php
$output = "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e"; // 日本語
$bom = "\xFF\xFE"; // UTF-16LE BOM; "\xFE\xFF" would indicate BE
$utf16 = $bom . mb_convert_encoding($output, "UTF-16LE", "UTF-8"); // name the source encoding explicitly
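
If it helps, here is a minimal sketch of how the whole export step might look, including the CRLF line endings mentioned in the question. The function name to_utf16le_with_bom and the export.txt file name are made up for illustration, and I am assuming the input really is valid UTF-8; whether PowerPoint needs anything beyond the BOM is something you would have to check on your side.

```php
<?php
// Hypothetical helper (not from the original post): turns UTF-8 text into
// UTF-16LE bytes with a leading BOM and CRLF line endings.
function to_utf16le_with_bom(string $utf8): string
{
    // Normalize CRLF, lone CR, or lone LF to CRLF before converting.
    $crlf = preg_replace('/\r\n|\r|\n/', "\r\n", $utf8);

    // "\xFF\xFE" is U+FEFF written in little-endian byte order.
    return "\xFF\xFE" . mb_convert_encoding($crlf, 'UTF-16LE', 'UTF-8');
}

// 日本語, a line break, then "OK" (input assumed to be UTF-8).
$bytes = to_utf16le_with_bom("\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e\nOK");

// Assumed output location, chosen only for this example.
file_put_contents(sys_get_temp_dir() . '/export.txt', $bytes);

// Show the BOM followed by the first character (U+65E5 as E5 65).
echo bin2hex(substr($bytes, 0, 4)), "\n"; // prints "fffee565"
```

Dumping the first few bytes like this is a quick way to confirm the BOM is really there before blaming the consuming application.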