Search code examples
phpunicodeutf-8escapingdecoding

How to decode Unicode escape sequences like "\u00ed" to proper UTF-8 encoded characters?


Is there a function in PHP that can decode Unicode escape sequences like "\u00ed" to "í" and all other similar occurrences?

I found similar question here but is doesn't seem to work.


Solution

  • Try this:

    $str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
        return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
    }, $str);
    

    In case it's UTF-16 based C/C++/Java/Json-style:

    $str = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($match) {
        return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UTF-16BE');
    }, $str);