
Convert UTF-16LE to UTF-8 in PHP


I use PHP's iconv function, but some characters don't convert correctly:

...
$s = iconv('UTF-16', 'UTF-8', $s);
...
$s = iconv('UTF-16//IGNORE', 'UTF-8', $s);
...
$s = iconv('UTF-16LE', 'UTF-8', $s);
...
$s = iconv('UTF-16LE//IGNORE', 'UTF-8', $s);
...

I also tried the mb_convert_encoding function, but it didn't solve my problem.

A sample text file: 9px.ir/utf8-16LE.rar


Solution

  • iconv supports the UTF-16LE encoding.

    You can use it to convert a string from UTF-16LE to UTF-8:

    $result = iconv($in_charset = 'UTF-16LE' , $out_charset = 'UTF-8' , $str);
    if (false === $result)
    {
        throw new Exception('Input string could not be converted.');
    }
    

    See the iconv documentation.

    Every code point that can be encoded in UTF-16LE can also be encoded in UTF-8, since both encodings cover the full Unicode range (only unpaired surrogates have no UTF-8 form). So this should fit your case.


    Edit: I was not able to reproduce the problem on a box of my own, but on another box I ran into this notice:

    Notice: iconv() [function.iconv]: Wrong charset, conversion from `UTF-16LE' to `UTF-8' is not allowed in ...

    It looks like not all iconv versions can actually convert UTF-16LE to UTF-8.

    A workaround might be to use mb_convert_encoding instead; at least it worked in this case:

    $result = mb_convert_encoding($str , 'UTF-8' , 'UTF-16LE');
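
The two approaches above could be combined into one helper that tries iconv first and falls back to mb_convert_encoding when iconv refuses the conversion. This is only a sketch: the function name utf16le_to_utf8 is made up for illustration, and stripping a leading byte order mark (FF FE) is an assumption about the input file, not something confirmed by the question.

```php
<?php
// Hypothetical helper (not part of the original answer): convert UTF-16LE
// bytes to UTF-8, preferring iconv and falling back to mbstring.
function utf16le_to_utf8(string $str): string
{
    // Assumption: the input may start with a UTF-16LE byte order mark (FF FE).
    // Drop it so it does not end up in the output as U+FEFF.
    if (strncmp($str, "\xFF\xFE", 2) === 0) {
        $str = (string) substr($str, 2);
    }

    $result = false;
    if (function_exists('iconv')) {
        // @ silences the notice some iconv builds raise; failure is detected
        // through the false return value instead.
        $result = @iconv('UTF-16LE', 'UTF-8', $str);
    }
    if ($result === false && function_exists('mb_convert_encoding')) {
        $result = mb_convert_encoding($str, 'UTF-8', 'UTF-16LE');
    }
    if ($result === false) {
        throw new RuntimeException('Input string could not be converted.');
    }
    return $result;
}

// "hé" encoded as UTF-16LE, preceded by a BOM.
$utf16 = "\xFF\xFE" . "h\x00" . "\xE9\x00";
echo bin2hex(utf16le_to_utf8($utf16)), "\n"; // prints 68c3a9
```

Checking the return value rather than relying on the notice keeps the fallback working on boxes where notices are hidden. If the file has no BOM, the strncmp check simply does nothing.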